Representation Theory
Representation Theory
Edwin Williams
The MIT Press Cambridge, Massachusetts London, England
© 2003 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

This book was set in Times New Roman on 3B2 by Asco Typesetters, Hong Kong, and was printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Williams, Edwin.
Representation theory / by Edwin Williams.
p. cm. — (Current studies in linguistics)
Includes bibliographical references and index.
ISBN 0-262-23225-1 (hc. : alk. paper) — ISBN 0-262-73150-9 (pbk. : alk. paper)
1. Grammar, Comparative and general—Syntax. I. Title. II. Current studies in linguistics series.
P291 .W54 2002
415—dc21
2002071774
To Daddy
Contents

Preface
Introduction: Architecture for a New Economy
Chapter 1: Economy as Shape Conservation
Chapter 2: Topic and Focus in Representation Theory
Chapter 3: Embedding
Chapter 4: Anaphora
Chapter 5: A/Ā/Ā̄/Ā̄̄
Chapter 6: Superiority and Movement
Chapter 7: X-Bar Theory and Clause Structure
Chapter 8: Inflectional Morphology
Chapter 9: Semantics in Representation Theory
References
Index
Preface
In 1971 I wrote the two required qualifying papers for Ph.D. dissertation work in linguistics. One was about "small clauses"—the notion that clause structure has several layers, that syntactic operations are associated with particular layers, and that each layer can be embedded directly, without the mediation of the higher layers. The other proposed that tones in tonal languages compose structures that are independent of segmental or syllabic structure and that a certain kind of mapping holds between the tonal and segmental representations. I guess these were the two best ideas I've ever had. After thirty years of trying to bring something better to light, I have given up and have determined instead that my further contribution will be to combine them—if not into one idea, then at least into one model of the linguistic system. That is what I try to do in this book.

The two ideas take the following guise: (1) syntactic economy is actually shape conservation (here I return to the idea from tonal systems that grammar involves not one complex representation, but two simple ones put into a simple relation to one another), and (2) different clausal types can be embedded at different levels (the Level Embedding Conjecture—an implementation of the "small clause" idea). In fact, though, when I put those two ideas together, a third popped out that isn't inherent in either. It's this third idea that is responsible for the sharpest new predictions in the book: the generalization of the A/Ā system to A/Ā/Ā̄/Ā̄̄ . . . , which may be viewed as an n-ary generalization of the binary structure of the NP Structure model proposed in Van Riemsdijk and Williams 1981. So this book also brings forward a strand of my collaboration with longtime friend Henk van Riemsdijk.

Most of the ideas in this book have been presented in series of four to five lectures or in one-week summer school courses: in Lesbos (1999), Plovdiv (1999), and Vienna (1996–2001), and at UCLA (1997), University of British Columbia (1998), LOT (2001), and University College London (2002). Other parts were presented in multiple meetings of the Dutch/Hungarian verb movement study group in Wassenaar, Pécs, Budapest, and Öttevény, in the years 1997–2001. I have benefited particularly from the extended contact with an audience that such series afford.

I have received encouragement in developing this book from Peter Ackema, Misi Brody, Memo Cinque, Christine Czinglar, Henry Davis, Rose-Marie Déchaine, Marcel den Dikken, Hans-Martin Gärtner, Jane Grimshaw, Yosef Grodzinsky, Catherine Hanson, Steve Hanson, Marika Lekakou, Mark Liberman, Ad Neeleman, Andrew Nevins, Øystein Nilsen, Jean-Yves Pollock, Martin Prinzhorn, Henk van Riemsdijk, Dominique Sportiche, Tim Stowell, Peter Svenonius, Anna Szabolcsi, Kriszta Szendrői, and Martina Wiltschko.

I do heartily thank Anne Mark for applying the Jaws of Life to the car wreck of a manuscript she got, and I won't let her edit just this one sentence so the reader may understand exactly what there is to be grateful to her for and why.
Introduction: Architecture for a New Economy
Opus ultra vires nostras agere praesumsimus. ("We have presumed to undertake a work beyond our powers.")
The work reported here brings to light two main findings: first, when syntax is economical, what it economizes on is shape distortion rather than distance; and second, this new notion of economy calls for a new architecture for the grammatical system, and in fact a new notion of derivation.

For example, the theta structure on the left in (1) has the same shape as the Case structure on the right.

(1) [agent [predicate theme]] ↜ [nominative [Case-assigner accusative]]

The two structures are isomorphic in an obvious sense. I will speak in this book of one structure as representing another structure if it is isomorphic to it, and I will use the wavy arrow to symbolize this type of representation. Sometimes one structure will be said to represent another even if not isomorphic to it, so long as it is nearly isomorphic to it, and nothing else is closer to it. It is in this sense that syntax economizes on, or tries to minimize, shape distortion. I will present evidence that this gives a better account of economy than distance-minimization principles like Shortest Move. The issue can become subtle, as each theory can be made to mimic the other; in fact, I will argue that some of the uses of distance-minimization economy in the minimalist literature are transparent contrivances to achieve shape conservation with jury-rigged definitions of distance.

The need for a new architecture should be evident from (1). In order to say that a theta structure is isomorphic to a Case structure, we need to have the two structures in the first place. The two structures in (1) have no standing in standard minimalist practice: there is no theta structure that exists independent of Case structure; rather, Case and theta are two
parts, or de facto "regions," of a single structural representation of a clause, the notion of clause that began with Pollock 1989 and has been elaborated in functional structure sequencing labs around the world.

The model in which a principle of shape conservation will fit most naturally is one in which the several different aspects of clausal structure are characterized as separate "sublanguages" (to anticipate: Theta Structure (TS), Case Structure (CS), Surface Structure (SS), Quantification Structure (QS), Focus Structure (FS)). Then the syntax of a sentence will be a collection of structures, one (or more; see chapter 3) from each of these sublanguages, and a set of shape-conserving mappings among them. In this sense, then, a new economy (shape conservation) calls for a new architecture ({TS, CS, SS, QS, FS}).

The new architecture offers a new style of clausal embedding that has no analogue in standard minimalist practice: the Level Embedding Conjecture of chapter 3, a scheme that tightly fixes the relation among locality, reconstructivity, and target type for syntactic relations in a way that I think is not available in any other model of grammar (see below for definitions of these terms). The new architecture requires, and in fact automatically provides, a generalization of the A/Ā distinction to an A/Ā/Ā̄/Ā̄̄ . . . distinction to derive these correlations.

I have called the theory Representation Theory to put the notion of economy at the forefront: a Case structure "represents" a theta structure it is paired with, and the essence of representation is isomorphism. So, syntax is a series of representations of one sublanguage in another.

Chapter 1 develops some analyses in which shape conservation is manifestly implicated in the domains of lexical structure, compound structure, bracketing paradoxes, and Case-theta relations.
These serve as a basis for framing a general theory according to which syntax consists of the sublanguages mentioned earlier, with the representation relation holding among them. The Mirror Principle is viewed not as a principle but as an effect that arises automatically whenever two different sublanguages each represent a third.

Chapter 2 applies the model to scrambling and its relation to topicalization, scope, and focus, using the concept of shape conservation to reanalyze these domains. Known properties are shown to follow from conflicting representation requirements, and language differences are analyzed as different choices in resolving such conflicts.

Chapter 3 defines different kinds of embedding for each of the sublanguages. At the two extremes are clause union embedding in TS and the
isolating, nonbridge verb embedding in FS; intermediate-sized clauses are embedded in the intervening levels. The Level Embedding Conjecture (LEC) says that the different clause types are not all embedded at the same level; rather, each type is embedded at the level at which it is defined. This leads to derivations quite different from those generated in other known models of syntax. A generalized version of the Ban on Improper Movement follows from this architecture.

Chapters 4–6 explore consequences of the LEC proposed in chapter 3. Three characteristics of a rule are its locality (its range), its reconstructivity (for a movement (or scrambling) rule, which relations are computed on its input and which on its output), and its target (the type of element—A-position, Ā-position, Ā̄-position, and so on—it targets). RT with the LEC automatically fixes the connections among them, or correlates them (I will thus refer to LRT correlations), enabling us to answer questions like "Why does long-distance scrambling reconstruct for binding theory, but not short-distance scrambling?" and generalized versions of such questions.

Chapter 4 defines different kinds of anaphors for each sublanguage; tight "coargument" anaphors are defined at TS, and long-distance anaphors at SS. The theory draws a connection between the locality of an anaphor and the type of antecedent it can have, where the types are "coargument, A-position, Ā-position, . . . ," in line with the LRT correlations of chapter 3.

Chapter 5 develops the empirical consequences of the generalized notion of the A/Ā relation that flows from the LEC, and the resulting generalized notion of reconstruction. Essentially, every pair of sublanguages in a representation relation can be said to give rise to a different A/Ā distinction and a different level of reconstruction.

Chapter 6 draws a distinction between movement, as classically understood, and misrepresentation, as defined here.
Under special circumstances an element might seem to have moved because it occurs in a structure that stands in a representation relation with another structure it is not strictly isomorphic to. I argue that classical movement does not reduce to misrepresentation, and in fact both are needed. Classical wh movement, for example, is a part of the definition of the sublanguage SS and does not arise through misrepresentation. In particular, I argue that parallelism effects observed in multiple wh movements are not the same kind of thing as the parallelisms that motivate shape conservation and that they appear to be so only for the simplest cases.
Chapters 7 and 8 develop the RT account of phrase structure and head movement. Chapter 7 develops an account of X-bar theory in which a lexical item directly "lexicalizes" a subsequence of functional structure; it then defines the notion of X-bar category in syntax implied by this notion of what a lexical item does. It is a consequence of the A/Ā generalization of previous chapters that Relativized Minimality must be a misgeneralization, in that it attempts to subsume head movement under a general theory of movement. Chapter 7 argues that head movement is not movement, but part of the X-bar calculus, and its locality follows from the laws of X-bar theory, not movement theory.

Chapter 8 explains how such lexicalized subsequences can be spelled out morphologically. Representation is argued to directly derive the Mirror Principle, with a strict separation of syntax and morphological spellout. A model of inflectional morphology is developed, a combinatorial system called CAT, which predicts a precise range of possible realizations of a set of universal inflectional elements; those possible realizations are compared with known facts. The mechanism is also applied to verb cluster systems and is proposed to be the underlying syntax of the recursive argument-of relation, wherever it occurs.

Chapter 9 develops some preliminary ideas about how semantics must be done differently in RT. Semantic compositionality must be rethought in view of the syntactic architecture; it becomes less likely that there can be a single representation of linguistically determined meaning. Chapter 9 also elaborates the notion of semantic value defined at each level, and it seeks to explicate the differences among types of anaphora (pronominal anaphora, ellipsis, anaphoric destressing) in terms of these different kinds of value.
Chapter 1: Economy as Shape Conservation
I begin by exploring a problem with the usual solutions to bracketing paradoxes. The solution to this problem leads to a new principle of economy, Shape Conservation, which shows itself capable of replacing the more familiar economy principles. I fashion a new theoretical architecture to maximize the empirical scope of the principle.

Linguists have distinguished two types of economy, local and nonlocal, to use Collins's (1996) terminology—that is, economy that compares derivations and economy that does not. Although research seems to have moved away from nonlocal economy, the principle studied here is nonlocal, and transderivational. It is sometimes suggested that computational considerations weigh against nonlocal economy, but I am personally willing to put such considerations aside while I try to figure out what the overall organization of the grammatical system should be. The computation would seem to reduce to a metric of tree similarity, which the considerations presented in this book delimit somewhat, but do not fully determine.

1.1 Bracketing Paradoxes
First, the problem with bracketing paradoxes. A bracketing paradox is a situation in which two different considerations about the structure of a form lead to different conclusions. Usually, but not always, one consideration is syntactic and the other semantic. Bracketing paradoxes are generally dispelled by some kind of syntactic restructuring operation. My first point is that any such restructuring must be inhibited by the existence of other derivations and forms, and that the relation of the restructuring to these other derivations and forms is "economical."
The phrase beautiful dancer is a typical example of a bracketing paradox. The phrase is famously ambiguous, having the meanings 'beautiful one who dances' and 'person who dances beautifully'. The two meanings can be represented as in (1a). The first reading is easy to get, but the second is a bracketing paradox. (& and ~& stand for ambiguous and nonambiguous, respectively.)

(1) a. &a beautiful [dance -er]
       i. beautiful [-er [dance]]
       ii. [-er [beautiful dance]]
    b. ~&a beautiful [person who dances]
    c. a person who [dances beautifully]
    d. *a [beautiful dance] -er
If (1ai) is the logical structure of (1a), and if modification is restricted to sisters, then (1a) should have only the meaning 'beautiful one who dances', because beautiful is sister to an expression headed by -er, which means 'one who'. But this leaves no room for (1aii), because that meaning, 'one who dances beautifully', has a logical structure that is athwart the structure of (1a).

Now, we could write a restructuring rule that would relate (1aii) to (1a), thus making it ambiguous, on the assumption that the relation of (1a) to (1ai) is transparent and will arise in any case. But a problem then arises for (1b). If we write the restructuring rule for (1a) in a completely general form, it will most likely apply to (1b) as well, making it ambiguous too; but it is not. Then why is it not?

The idea I would like to explore, or exploit, is that (1b) is not ambiguous because the "paradoxical" branch of the ambiguity is already covered by a different form, namely, (1c), and (1c) fits the meaning better, that is, more transparently. In other words, (1c) blocks (1b) from having its nontransparent meaning. By contrast, the right (transparent) form for the other, nontransparent meaning of (1a) is (1aii), which cannot be generated. So what we have here is a principle that says, "Use the right form, unless there isn't one. If there isn't one, it's OK to use a form that doesn't match the meaning."

Of course, the failure of (1b) to mean (1c) could be explained differently; it could be taken to represent an island or locality condition on the restructuring operation, for example. But further cases, such as compounding, suggest that this view is wrong. It is a stricture of form that gives rise to the gap. We expect then that if there is no stricture of form, there will be no gap, and consequently no
bracketing paradox. The system of English compounds is striking in its lack of bracketing paradoxes.

(2) a. kitchen [TOWEL rack]
    b. [KITCHEN towel] rack
    c. "[x [y z]] means [[x y] z] only if [x [y z]] is not generable"

For example, we have the famous compounds in (2a) and (2b), which have the meanings and accent patterns indicated. Each structure determines a different meaning and a different pronunciation; therefore, the meanings and pronunciations and structures are in one-to-one correspondence, and we can say that in each case the structure "mirrors" or "represents" the meaning perfectly. Importantly, (2a) is unambiguous and cannot have the meaning that (2b) has. The question is, why isn't the restructuring mechanism for bracketing paradoxes, whatever it is, applicable here? Why can't the form in (2a) (and its predictable pronunciation) be restructured so it is semantically interpreted as though it were like (2b)?

In light of the examples in (1), we may now attribute the lack of ambiguity in (2a) to the existence of the form (2b) itself; the other meaning that (2a) would have is the one that (2b) represents directly. The reason there are no bracketing paradoxes in the compounding system is that the "right" structure is always generable; this is expressed in (2c). And the reason for that is that there are only the barest restrictions on the syntax of N-N compounds—any pair of nouns can be concatenated, bar none. It is only where there is some stricture of form, as in (1d), that bracketing paradoxes can arise. The rest of the book develops this idea, exercised across the whole of syntax, and the architecture that syntactic theory and syntactic derivation must have in order for this account of bracketing paradoxes to work.
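The selection logic in (2c) can be sketched as a toy procedure: a form may carry a nonmatching meaning only if no generable form matches that meaning transparently. The tuple encoding of bracketings and the equality test for "transparency" are illustrative assumptions of this sketch, not part of the text's formal apparatus.

```python
# Toy sketch of the blocking principle in (2c): a form may carry a
# nonisomorphic meaning only if the isomorphic (transparent) form is
# not generable. Bracketed structures are encoded as nested tuples.

def transparent(form, meaning):
    """A form represents a meaning transparently iff their bracketings match."""
    return form == meaning

def can_express(form, meaning, generable_forms):
    """(2c): [x [y z]] may mean [[x y] z] only if no generable form
    is transparent for that meaning."""
    if transparent(form, meaning):
        return True
    return not any(transparent(f, meaning) for f in generable_forms)

# 'kitchen towel rack': both bracketings are generable compounds,
# so neither can take over the other's meaning.
forms = [("kitchen", ("towel", "rack")), (("kitchen", "towel"), "rack")]
paradoxical = can_express(("kitchen", ("towel", "rack")),
                          (("kitchen", "towel"), "rack"), forms)
print(paradoxical)  # False: the transparent competitor blocks restructuring
```

For beautiful dancer, by contrast, the transparent form for the paradoxical reading, *[beautiful dance]-er, is not generable; with no transparent competitor in the generable set, the mismatched form is licensed, mirroring the pattern in (1).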
That language should seek isomorphic matches between related structures, and accept nonisomorphic matches only when isomorphic matches are missing, is really an application of Pāṇini's principle, "Use the most specific applicable form." The isomorphic form is simply the most specific applicable form, and distorted forms are available only when the isomorphic form is not. Shape Conservation thus turns Pāṇini's principle into the economy condition governing syntax. (For further thoughts on this, see Williams 1997, where it is shown that even economy-as-distance-minimization can be construed as an application of Pāṇini's principle.)

I will now expand the scope of this kind of treatment somewhat. Adverbial modification also manifests bracketing paradoxes. As a sentence
adverb, probably must modify a tense, or some higher sentence operator. Completely, on the other hand, must modify the VP. (3) shows how these sort out: in (3a) and (3b) Tense and V are separate, so the two adverbs occupy separate positions. But what happens when they are not separate, as in (3c,d)?

(3) a. John probably was completely cleaning it out.
    b. *John completely was probably cleaning it out.
    c. John probably [clean + ed] it out. (probably [-ed [clean]])
    d. John completely [clean + ed] it out. (??completely [-ed [clean]])

(3c) poses no problem: if Tense is the exterior element of V + ed, then probably can be seen as modifying it directly. But then (3d) is a bracketing paradox: past tense intervenes between the adverb completely and the verb it is meant to be modifying. So, completing the analogy with the previous examples, we can say that (4b) can have the meaning in (4a), because (4a) itself is not generable.

(4) a. *[completely clean]V -ed
    b. completely [clean -ed]V

This gives a different view of modification than we would expect to have in the exploded-Infl clause structure proposed by Pollock (1989). In the Pollock-style clause structure this particular bracketing paradox does not arise. Tense is a separate element; both (3a) and (3b) have the structure in (5), and each adverb is adjoined to its proper modifiee. Then V moves to Tense covertly.

(5) [probably T [completely [V NP]]]

A fully lexicalist account of inflection, where functional structure is not part of clause structure directly but is rather part of the internal structure of lexical items, will always involve us in these sorts of bracketing paradoxes, and so the viability of the lexicalist account depends on gaining some understanding of how bracketing paradoxes work. My first guess is that bracketing paradoxes arise when the "best match" for a given structure is not available for some reason, so the "next best match" must be used.
In a lexicalist account of inflection, functional structure will be visible only at the "joints" between words, so any case in which an adverb modifies an interior element will be a bracketing paradox. Chapters 7 and 8 pursue a lexicalist model of inflection in RT.
1.2 The Meaning of Synthetic Compounds
The notion of representation, as understood here, can also be applied to the interpretation of compound structures. The problem to be solved arises precisely because of the extreme productivity of compounding. Any two nouns can be put together, and some meaning that connects them can be concocted, the only inhibition being that the head, in the normal case, must count as setting the "major dimension" in determining the meaning of the compound; the nonhead then provides discrimination within the major dimension. So my students have no trouble thinking of endless lists of possible relations that could hold between the two randomly selected nouns biography and bicycle of the following compounds:

(6) a. biography bicycle: a bicycle on which biographies are inscribed, a bicycle on which manuscripts of biographies are messengered in a large publishing house, etc.
    b. bicycle biography: a biography written while touring on a bicycle, the biography of a bicycle, etc.

Although there are quite narrow rules for pronouncing compounds, it would seem we can be no more precise about how to determine their meaning than to say, "Find some semantic relation that can hold between the two elements." This is the general understanding of what have been called root compounds. It has also been suggested that there is a substrain of compounds, complement-taking deverbal nouns, that follows a more precise rule.

(7) a. book destroyer
    b. church goer
    c. *goer

If the root rule can compose new forms only out of existing forms, then the nonexistence of (7c) is cited as evidence that (7b) cannot arise simply by applying that rule; hence, a special rule for these synthetic compounds is postulated (Roeper and Siegel 1978). The synthetic rule is a specific rule that manipulates the thematic structure of a lexical item, adding an element that satisfies one of its theta roles.
For example, starting with the verb go, which takes a goal argument, this rule adds church to it to satisfy that goal argument in the compound structure. One problem with positing two rules for English compounding, the root rule and the synthetic rule, is that the outputs of the two rules are suspiciously similar: both give rise to head-final structures, with identical
accent patterns. But a much greater problem, and the one I want to concentrate on, is that the output of the synthetic rule is completely swamped by the output of the root rule. Since the root rule says, ‘‘Find some relation R . . . ,’’ with no imaginable restrictions on what R can be, and since ‘‘x is (or ‘restricts the reference of ’) the thematic complement of y’’ is some such R, what is to stop the root rule from deriving compounds just like the synthetic rule, making the synthetic rule redundant? We might begin by thinking of the connection between the two rules as a ‘‘blocking’’ relationship (i.e., governed by Pa¯nini’s rule): the specific ˙ synthetic rule blocks the more general root rule, in order to prevent the root rule from deriving synthetic compounds. I think the intuition behind this idea is correct, but it raises a telling question that can only be answered by bringing in the notion of representation in the sense developed here. But the first thing to establish is that there really is a problem. Is there anything to be lost by simply giving up the synthetic rule, leaving only the root rule for interpreting compounds? There is at least this: the root rule will not only derive all the good synthetic compounds, but also derive bad ones. Consider a further fact about synthetic compounds, specifically about nominalizations derived from ditransitive verbs like supply: the two theta roles for supply have to be realized in a particular order with the noun supplier. (8) a. army gun supplier b. *gun army supplier Presumably (8a) is the only form generated by the specific synthetic rule, but why can (8b) not then be generated by the root rule? The answer cannot be ‘‘blocking,’’ because the synthetic rule cannot produce (8b), and so the root rule will not be blocked for that case. 
Apparently army supplier is a decent compound on its own, so the question reduces to this: what is to stop the root rule from composing gun and army supplier as shown in (9) (where R(x, y) is "y is (or 'restricts the reference of') the theme argument of the head of x")?

(9) a. Syntax: gun + "army supplier" → gun army supplier
    b. Semantics: R(army supplier, gun)

If such Rs are admitted, and I see no principled way to stop them, then the root rule can derive anything, including "bad" synthetic compounds—a real problem.
In fact, if any R is allowed, it is not even clear how to maintain the special role of the head in compounds—the right R could effectively reverse the role of head and nonhead.

(10) R(H, non-H) = (some) R′(non-H, H)

In other words, R says, "Interpret a compound as though the head were the nonhead and the nonhead were the head." This R defeats the very notion of head, as it pertains to meaning.

To treat the second problem first: whatever the semantic content of the notion "head" is, it relies on every relation R having an obvious choice about which end of the relation is the "head" end. Semantically, the head is the "major dimension" of referent discrimination. In the ordinary case such as baby carriage the choice is obvious: a baby carriage is not a baby at all, but a type of carriage, subtype baby. But the dvandva compounds show how very slim the semantic contribution of headship can be.

(11) a. baby athlete
     b. athlete baby

(12) a. athlete celebrity
     b. celebrity athlete

In each of these the (a) and (b) examples have the same referents: "things that are babies and athletes" or "things that are athletes and celebrities." But in fact (11b) is somewhat strange, presumably because it implies that babies come in types, one of which is "athlete," even if it is not obvious why this is less acceptable than the notion that athletes come in types, one of which is "baby."

I think that these are both "representational" questions: what syntactic structures "represent" various semantic structures, where one structure represents another by mirroring its structure and parts. We can turn the concept of head into a representational question, in the following way:

(13) Suppose that
     a. the head-complement relation is a syntactic relation [H C], and
     b. R is any asymmetric semantic relation {A, B} between two elements.
     Then how is [H C] to be matched up with {A, B}?
The syntactic relation will best "represent" the semantic relation if its asymmetry is matched by the asymmetry of {A, B}—but which identification is the one that can be said to match the asymmetries? This question
can best be answered by first considering the question, what is the syntactic asymmetry itself? I think the source of the syntactic asymmetry is the syntactic definition of head, which I take to be the following, or at least to have the following as an immediate consequence:

(14) [H C] is a (syntactic thing of type) H.

That is, syntactically, a unit composed of a head and its complement ([H C]) "is a thing of the same type as" the type of H itself. Phrasing the matter this way, there can be no question which of the two items in (15b) is "best represented" by the form in (15a), namely, (15bi).

(15) a. [baby athlete]
     b. i. "baby athlete" is a thing of the same type as "athlete"
        ii. "baby athlete" is a thing of the same type as "baby"

And likewise for "athlete baby." Crucially, I am assuming that the representation must match the asymmetry of syntax with some asymmetry in the structure it is representing, as a part of the representation relation.

Now let us return to *gun army supplier. By the root derivation mentioned earlier, this form has (among others) a meaning in common with army gun supplier, and the question is how to block that. To apply the logic above, we must assume that there is a theta structure with the form in (16a), but none with the form in (16b).

(16) a. [goal [theme supplier]]
     b. *[theme [goal supplier]]

This is a fact about theta structures themselves, not how they are represented. Then we can say that this is best represented by a structure in which the highest N is mapped to the goal, and the next highest N is mapped to the theme, rather than the reverse.

(17) [Tree diagram in the original, pairing the compound structure with the theta structure in (16a).]
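The mapping just described — assign the compound's nonheads to theta roles so as to maximize isomorphism with the theta hierarchy in (16a) — can be sketched as a toy procedure. The scoring function, role labels, and tuple encoding are illustrative assumptions of this sketch, not part of the theory's formal apparatus.

```python
# Toy sketch of (16)-(17): a compound best represents a theta structure
# when its top-down order of nonheads mirrors the top-down order of
# theta roles. We enumerate role assignments and keep the best match.
from itertools import permutations

THETA = ("goal", "theme")  # (16a): [goal [theme supplier]], goal highest

def score(nonheads, assignment):
    """Count positions where the compound's top-down order of nonheads
    matches the theta structure's top-down order of roles."""
    return sum(assignment[n] == THETA[i] for i, n in enumerate(nonheads))

def best_assignment(nonheads, roles):
    """Pick the role assignment that maximizes isomorphism."""
    return max((dict(zip(nonheads, p)) for p in permutations(roles)),
               key=lambda a: score(nonheads, a))

# 'army gun supplier': army is the higher nonhead, gun the lower.
print(best_assignment(("army", "gun"), THETA))
# -> {'army': 'goal', 'gun': 'theme'}, as in (8a); the reverse mapping,
#    which would license *gun army supplier, scores lower and loses.
```

The point of the sketch is that nothing constrains R itself; the constraint comes from the competition, since the isomorphic assignment always outscores the scrambled one.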
The result is that R can be any imaginable relation; but for a given representation relation, we must choose R so as to maximize isomorphism to the represented structure. This is why the root rule appears to be
constrained by the synthetic rule. A compound does not have to represent a theta relation; but if it does, it must do so in the best possible way.

We have seen that there are two ways to think about this, in terms of rules and in terms of representation. The account in terms of rules is insufficient in an important way and can be remedied only by reference to something like representation. Therefore, we may as well devote ourselves to solving the problem of representation and in the end be able to forget about the rules. It is tempting to think of the synthetic rule as blocking the root rule, but this does not give a straightforward account of why (8b) is ungrammatical, since the synthetic rule would not derive it anyway. In order to prevent (8b) by rule blocking, we must have recourse to what (8b) is trying to do, and then block it because (8a) does the same thing better. But of course what it is trying to do is to represent (8a), only it does it less well than another form. I don't see any way around this.

1.3 Case ↝ Theta Representations
I have spoken of one system ‘‘representing’’ another system. I have chosen the word represent purposely to bring to mind the mathematical sense of representation, which involves isomorphism. So, the set of theta structures at the level of Theta Structure (TS) is one system, and the stems and affixes of a language are another system, and we can speak of how, and how well, one represents the other. For example, we have the theta structure ‘‘complement-predicate,’’ and this structure is represented by the stem-suffix structure provided by morphology. Of course, in this case there is a natural isomorphism that relates the two.

(18) TS ⇜ Morphology
     TS: {complement predicate} ⇜ Morphology: [stem suffix] (e.g., lique-fy)

In the case of TS we (as investigators) are lucky: there are two different systems that represent TS. One is morphology, or word structure, and the other is Case theory, or, as it will be called here, the level of Case Structure (CS): the system of Case assigners and their assignees, a part of phrasal syntax. These representations are different: they reflect the difference between affix and XP, differences in the positioning of the head, and other differences. But they are the same in their representation of theta structures, so we can learn something about the representation relation by
comparing them. (Later in this chapter, and in more detail in chapter 7, I will derive the Mirror Principle from this arrangement, and in chapter 4, some other consequences.) Throughout this book the wavy arrow (⇜ or ⇝) will stand for the representation relation. (19a) diagrams the arrangement under which both morphology and phrasal syntax (specifically, ‘‘Case frames’’ in phrasal syntax) represent theta relations.

(19) a. Morphology ⇝ TS ⇜ CS
     b. {supply theme}
        i. ⇜ [gun supplier]N
        ii. ⇜ [supply guns]VP
     c. {{supply theme} goal}
        i. ⇜ [army [gun supplier]]N
        ii. ⇜ [[supply guns] to an army]VP
     d. {{advise theme} goal}
        i. ⇜ [graduate student [course advisor]]N
        ii. ⇜ *[course [graduate student advisor]]N
        iii. ⇜ [[advise graduate students] about courses]VP
        iv. ⇜ *[[advise courses] to graduate students]VP
     e. advise: NP aboutP

By stipulated convention, the arrow points from the representing structure to the represented structure. (19b) illustrates the simple theta structure consisting of the predicate supply and its theme complement; this relation can be represented by either a compound (N) or a Case frame (VP), as shown. A more complex theta structure, as in (19c), begets correspondingly more complex representations. For (19c) the Case and morphological representations are different, but both are isomorphic to the theta structure, so long as linear order is ignored. In other cases, however, the two representations diverge. For example, if advise takes a theme and a goal, in that order, then the compound seems to be isomorphic to the resulting structure (19di), but the syntactic representation does not seem to be (19diii). And the compound that would be isomorphic to the syntactic representation is ungrammatical (19dii). How can this come about? We have already seen why the compound (19dii) is ungrammatical: there is a better representation of the target theta structure.
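The notion of ‘‘best representation’’ at work in (19) can be made concrete with a toy sketch. This is my own illustration, not a formalism from this book: the nested-list encoding, the helper names `flatten` and `is_isomorphic`, and the role-pairing dictionary are all assumptions. The idea is simply that a compound represents a theta structure isomorphically just in case aligning the two hierarchies outermost-in reproduces the intended pairing of words with theta roles.

```python
def flatten(struct):
    """Yield the leaves of a nested [outer, inner] pair, outermost first."""
    while isinstance(struct, list):
        outer, struct = struct
        yield outer
    yield struct

def is_isomorphic(theta, compound, pairing):
    """A compound represents a theta structure iff matching the two
    hierarchies level by level reproduces the intended word-role pairing."""
    roles = list(flatten(theta))      # e.g. ['goal', 'theme', 'supplier']
    words = list(flatten(compound))   # e.g. ['army', 'gun', 'supplier']
    return all(pairing.get(w) == r for r, w in zip(roles, words))

theta = ['goal', ['theme', 'supplier']]   # the theta structure of (16a)
pairing = {'army': 'goal', 'gun': 'theme', 'supplier': 'supplier'}

print(is_isomorphic(theta, ['army', ['gun', 'supplier']], pairing))  # True
print(is_isomorphic(theta, ['gun', ['army', 'supplier']], pairing))  # False
```

On this toy construal, *gun army supplier* is blocked because a competing form for the same theta structure scores as isomorphic while it does not.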
As for the syntactic representation, suppose that the verb advise is stipulated to have the Case frame in (19e), but not the one that would allow (19div). Then the theta structure in
(19d) will map, or mismap, onto (19diii), because that is the best available. Hence the divergence between the compound and the Case structure. (19diii) is a misrepresentation of (19d) (and so is a ‘‘bracketing paradox’’), which arises from a perhaps arbitrary stricture in the representing system, the stipulated subcategorization of advise (19e). The exceptional-Case-marking (ECM) construction is another obvious example of a Case-theta misrepresentation. TS provides a representation of the sort given in (20). Now suppose that CS provides the representation indicated, but nothing isomorphic to the theta structure. Then the Case structure will misrepresent the theta structure. This account misses an important fact about ECM—that it is a rare construction—but captures the essential features of the construction itself.

(20) ECM as a bracketing paradox in syntax
     TS: [believe [Mary to be alive]]
     CS: [[believe Mary] to be alive]

Throughout these examples the economy principle at work is this: ‘‘Use the ‘most isomorphic’ structure that satisfies the strictures of the representing level.’’ If only we could specify in a general way what sets of structures are taken to be in competition, we would have a theory.

1.4 Shape Conservation
I think that most of the economy proposals about grammatical structure made during the 1990s can be understood as principles partly designed to aid and abet the kind of shape conservation under discussion here. First of course is the Mirror Principle (Baker 1985), which says that the interior structure of words will mirror the exterior syntactic structure in which the words occur. The Mirror Principle is not really a principle, but a robust generalization that is reflected in different theories in different ways. It is implemented in Chomsky 1993, for example, by the algorithm of feature checking, which is stated in such a way that as a verb moves up the tree, one of its features can be checked in syntax only after features more deeply buried in the word have already been checked; this achieves the mirror effect because morphology adds the most deeply embedded features first. This reduces the Mirror Principle to coordinating the two uses of the feature set via a list, or more specifically, a ‘‘stack.’’ A stack gives mirror behavior, but is of course only one way to get it.
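The stack coordination just described can be pictured with a small sketch of my own (the affix and head labels are invented; this is an illustration of the idea, not Chomsky’s actual formalism). Reading the word’s affixal features from the outside in and pushing them means that popping hands syntax the most deeply buried feature first, so the lowest head checks the affix closest to the root.

```python
# Toy rendering of stack-coordinated feature checking (labels invented).
# Pushing the affixes outermost-first and then popping reverses the order,
# so checking proceeds from the most deeply embedded feature outward as
# the verb raises -- the mirror effect.

word_outside_in = ['AgrS', 'T', 'AgrO']   # affixes read from the outside in

stack = []
for affix in word_outside_in:
    stack.append(affix)

heads_bottom_up = ['head1', 'head2', 'head3']  # heads met as the verb raises
checking = [(head, stack.pop()) for head in heads_bottom_up]

print(checking)
# [('head1', 'AgrO'), ('head2', 'T'), ('head3', 'AgrS')]
```

The reversal a stack provides is what makes the most deeply buried feature available first; as the text notes, this is only one way to obtain mirror behavior.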
My own view is that the Mirror Principle arises from the holistic matching of two structures. Since a list is an ‘‘abstract’’ of a structure, it can serve the same purpose in some circumstances, but only where the list is an adequate abstract of the structure in question. I regard Chomsky’s mechanism as an artifice that mimics structural isomorphism for simple cases—essentially right-linear structures, which are equivalent to lists, in the sense that there is an obvious way to construct a list from a right-linear structure and vice versa. As mentioned earlier, I take the Mirror Principle to be the result of having two systems that represent one and the same theta system, in the sense of isomorphic representation.

(21) Mirror Principle
     morphology ⇝ theta roles, inflectional elements ⇜ Case system

So, just as there are derivational pairs that mirror each other (22a,b), there are also inflectional pairs that do the same thing (22c,d).

(22) Derivation
     a. [can [swim]VP]
     b. [[swim]V able]
     Inflection
     c. [didT,VP [see]VP]
     d. [[see]V -edT,V]

In the same vein, Chomsky’s (1993, 1995) definition of equidistance can be seen as a principle that promotes shape conservation, though without explicitly saying so. The question he posed, to which equidistance was the answer, is, why does the object move to AgrO and the subject to AgrS, and not vice versa? (Here I use Chomsky’s (1993) terminology; the problem remains in more recent Agr-less theories.) Chomsky engineers a solution to this problem in the definition of equidistance, and as a result, the permitted combination of movements is the familiar pair of intersecting movements.

(23)
Verb movement ‘‘extends the domain’’ of the lowest NP, as domain is defined in terms of head chains. With the domain of the lower NP extended in this way, the two NPs in (23) are in the same domain and hence, by definition, equally distant from anything outside that domain; hence, they are equally eligible to move outside that domain; hence, the subject can move over the object without violating economy conditions, and the intersecting derivation results. A ‘‘shortest derivation’’ principle rules out the other, nesting derivation. The odd result is that although the economy conditions are distance minimizing, distance itself is never defined, only equidistance. I believe this is a clue that the result is artificial. Intersecting paths are not what previous work has taught us to expect from movement. (24a) illustrates the famous intersecting pair of tough movement and wh movement; as is evident, the intersecting case is much worse than the nesting case (24b) (Fodor 1978).

(24) a. *Which sonatas is this violin easy to play tsonatas on tviolin?
     b. Which violin are these sonatas easy to play tsonatas on tviolin?

So the intersecting movement of subject and object is mysterious. Intersection might be an illusion arising from the analytic tools and not from the phenomenon itself. Intersection only arises if two items are moving to two different positions in the same structure. But suppose that instead of moving both subject and object up a single tree, we are instead trying to find their correspondents in a different tree altogether—the sort of operation illustrated in (25). Then there is no intersection of movement; what we have instead is a setup of correspondences between two structures that preserves the interrelation of those elements (the subject and object).

(25)
In standard minimalist practice, A would be embedded beneath B, and movement would relate the agent to nominative, and the theme to accusative. But in RT these relations are a part of the holistic mapping of TS (containing A) to CS (containing B). An examination of Holmberg’s (1985) generalization leads to similar conclusions: it is better seen as a constraint on mapping one representation into another, than as a constraint on the coordinated movements, within a single tree, of the items it pertains to (verb and direct object).
The generalization says that object shift must be accompanied by verb movement: if the object is going to move to the left, then its verb must do so too. The following are Icelandic examples in which the verb and/or the direct object can be seen to reposition itself/themselves leftward over negation:

(26) a. að Jón keypti ekki bókina          (V neg NP)
        that Jon bought not the-book
        ‘that Jon didn’t buy the book’
     b. að Jón keypti bókina ekki tV tNP   (V NP neg)

(27) a. Jón hefur ekki keypt bókina.       (aux neg V NP)
        Jon has not bought the-book
        ‘Jon has not bought the book.’
     b. *Jón hefur bókina ekki keypt tNP.  (aux NP neg V)
(26) shows that the object can appear on either side of negation, so long as both are to the right of the verb. (27) shows that when the verb is to the right of negation, the object cannot cross negation, even though it could in (26). Clearly, what is being conserved here is the relation of the verb to the object. (This, by the way, is not Holmberg’s original proposal, but one derived from it that many researchers take to be his proposal. In fact, he proposed that the object cannot move at all unless the verb moves, a weaker generalization; it remains an empirical question which version is the one worth pursuing.) There are various proposals for capturing Holmberg’s generalization, including Chomsky’s (1995) idea that the D and V ‘‘strength’’ features of the attracting functional projection must be coordinated—that is, both strong or both weak. This won’t really work, because if the V and the direct object are attracted to the same functional projection, they will cross over each other, and this is exactly what is not allowed.
(28) ‘‘ . . . AgrO is {strong [D-], strong [V-]}.’’ (Chomsky 1995, 352)

(29) [tree diagram: AgrO bearing strong DAgrO and strong VAgrO features]
In order to capture the strong and most interesting form of Holmberg’s generalization (i.e., in order to guarantee that the object cannot cross the
verb), Chomsky’s account must be accompanied by a further stipulation that the V obligatorily moves to Tense. But I think that further facts demonstrate the insufficiency of this approach. When there are two objects, they cannot cross over each other, though the first can move by itself.

(30) a. NP V ekki NP1 NP2
     b. NP V NP1 ekki t1 NP2
     c. NP V NP1 NP2 ekki t1 t2
     d. *NP V NP2 NP1 ekki t1 t2
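The pattern in (30) amounts to a subsequence condition, which can be stated as a toy check (my own encoding; the function name and the list representation are assumptions, not anything from the text, and the subject NP and traces are left out for simplicity): a derived order is licit just in case the verb and the two objects keep their base relative order, wherever negation ends up.

```python
def conserves(base, derived):
    """True iff the elements of `base` appear in `derived` in the same
    relative order; other material (here 'ekki') is ignored."""
    filtered = [x for x in derived if x in base]
    return filtered == list(base)

base = ['V', 'NP1', 'NP2']

print(conserves(base, ['V', 'ekki', 'NP1', 'NP2']))   # (30a) True
print(conserves(base, ['V', 'NP1', 'ekki', 'NP2']))   # (30b) True
print(conserves(base, ['V', 'NP1', 'NP2', 'ekki']))   # (30c) True
print(conserves(base, ['V', 'NP2', 'NP1', 'ekki']))   # (30d) False
```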
Clearly, coordinating attraction features will not work here either. What is obviously going on is that any set of movements is allowed that does not perturb the interrelation of V, NP1, and NP2. Again, a holistic principle of Shape Conservation would seem to go most directly to the heart of the problem. A further mystery for the standard view arises from the fact that Holmberg’s generalization does not hold for V-final languages like German.

(31) Sie hat Peter gestern gesehen.
     she has Peter yesterday seen
     ‘She saw Peter yesterday.’

In (31) the object has moved leftward over the adverb without an accompanying movement of the verb. If the view I am suggesting here is correct, Holmberg’s generalization does not hold because the leftward movement of the object in Germanic (over an adverb) does not change the relation of object to verb—the original order is conserved. In particular theories shape conservation shows up in particular ways. In hyper-Kaynian theories (Antisymmetry theories with massive remnant movement) there is a signature derivation of shape-conserving mappings. The key is systematic remnant movement—namely, remnant movement resulting automatically from the fact that a phrase is a remnant. All transformational theories of grammar have countenanced remnant movement (see chapter 5 for discussion): NP movement can give rise to an AP with a gap in it (32a), and then that AP can be displaced by wh movement (32b).

(32) a. John is [how certain t to win]AP
     b. [how certain t to win]AP is John
But in such a case the two movements are triggered by different things (Case for NP movement and wh requirements for wh movement); and in fact the movements can occur alone, and so are not coordinated with one another. But in hyper-Kaynian remnant movement the movement of the remnant and the movement that creates the remnant are keyed to each other in some way. There are several ways to implement this (one could propose that both movements are triggered by the same attractor, or some more complicated arrangement), but in any such arrangement the movements will always be paired. Now suppose we find evidence in RT for a shape-conserving ‘‘translation’’ of structures in one level (L1) to structures in another (L2), as shown in (33) (where the lines are points of correspondence under the shape-conserving mapping).

(33)
We can mimic this behavior in Antisymmetry as follows. First, the derivation concerns a single structure, rather than the pair of structures in (33); that structure is the result of embedding FL1 as a complement of FL2. Three movements are needed to map the material in the embedded (FL1) structure into positions in the higher (FL2) structure in shape-conserving fashion. We therefore need four specifiers, of F0, F1, F2, and F3 (shown in (34), with F0 at the very top not visible). F3 in (34) corresponds to FL1 in (33), and SpecF3 corresponds to SpecFL1. Instead of mapping from one level to another, as in RT, we move everything in F3 up the tree into the region of F1 and F0. In order for these movements to achieve shape conservation, a minimum of three moves is needed, two movements of SpecF3 and one of F3 itself, in the following order: (a) movement of SpecF3, making F3 a remnant; (b) movement of that remnant to a Spec higher than the one SpecF3 was moved to; (c) a second movement of SpecF3 (to SpecF0), to ‘‘reconstitute’’ the original order of SpecF3 and the rest of F3.
(34) Achieving shape conservation in Antisymmetry
What is conserved is the order, and the c-command relations, among the elements of F3. Of course, F3 itself is not conserved, having been broken into parts, but since the parts maintain their order and c-command relations and are therefore almost indistinguishable from an intact F3, the result does deserve some recognition as exemplifying shape conservation. I believe there is no simpler set of movements in Antisymmetry that could be called shape conserving. For this reason, I find it telling that derivations such as (34) abound in the Antisymmetry literature. It suggests to me that there is something fundamental about shape conservation. Since Antisymmetry was not built to capture shape conservation directly, it can only do so in this roundabout way—yet this roundabout derivation occurs on every page. Of course, not all derivations in Antisymmetry instantiate (34), only the shape-conserving ones. After all, things do get reordered in some derivations, in all accounts. But it will still be suspicious if the ‘‘nothing special is happening’’ derivation in Antisymmetry always instantiates (34). It suggests to me that (33) is right. Another principle with shape-conserving character was a principle of Generative Semantics, where interpreted structure was deep structure, and surface structure was the endpoint of derivation in a completely linear model. The gist of it is this: if Q1 has scope over Q2 in interpreted structure, then Q1 c-commands Q2 in surface structure (see, e.g., Lakoff
1972). Ignore for now that the principle is false to certain facts, such as the ambiguity, in English, of sentences with two quantified NPs (e.g., Everyone likes someone)—it represents a real truth about quantifiers, and I will in the end incorporate it into RT directly as a subinstance of the Shape Conservation principle. This principle has been reformulated a few times—for example, by Huang (1982, 220),

(35) General Condition on Scope
     Suppose A and B are both QPs or Q-expressions, then if A c-commands B at SS, A also c-commands B at LF.

and by Hoji (1985, 248).

(36) *QPi QPj tj ti
     where each member c-commands the member to its right.

Probably related is the observation widely made about a number of languages that if two quantifiers are in their base order, then their interpretation is fixed by that order; but if they have been permuted, then the possibility of ambiguity arises. All of these versions of the principle achieve the same thing: a correspondence between (something close to) an interpreted structure and (something close to) a heard structure. In fact, the correspondence is a sameness of structure, and so encourages us to pursue the idea of a general principle of Shape Conservation. Lakoff’s and Huang’s versions are transparently shape-conserving principles. Hoji’s is not, until one realizes that it is a representational equivalent of Huang’s and Lakoff’s. Fox’s (1995) results concerning economy of scope can be seen in the same light. Finally, Shape Conservation bears an obvious relation to ‘‘faithfulness to input’’ in Optimality Theory and to the f-structure/c-structure mapping in Lexical-Functional Grammar. I will comment further on the relation between RT and these other theories in chapter 3. I have by now recited a lengthy catalogue of shape-conserving principles in syntax: the Mirror Principle, equidistance, Holmberg’s generalization, various scope principles, faithfulness.
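With both levels flattened to lists ordered by c-command, Huang’s condition (35) reduces to order preservation, which a toy check makes explicit (my own encoding; the function name, the list representation, and the quantifier labels are illustrative assumptions, not anything from the text):

```python
def licensed(ss_order, lf_order):
    """True iff every c-command pair at SS is preserved at LF, with each
    level flattened to a list ordered by c-command."""
    pos = {q: i for i, q in enumerate(lf_order)}
    return all(pos[a] < pos[b]
               for i, a in enumerate(ss_order)
               for b in ss_order[i + 1:])

print(licensed(['every', 'some'], ['every', 'some']))  # True: LF scope matches SS
print(licensed(['every', 'some'], ['some', 'every']))  # False: inverse scope blocked
```

On this rendering, Hoji’s (36) is the representational statement of the same order-preservation requirement.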
I omitted Emonds’s Structure Preservation despite its similarity in name, because it governs individual rule applications and so lacks the holistic character of the other principles. But I would add one more to the list: to my knowledge, the first shape-conserving principle in the tradition of generative grammar was proposed
in Williams 1971b, namely, that tonal elements (e.g., High and Low) are not features of vowels or syllables, but constitute a representation separate from segmental structure, with its own properties, and that that separate representation is made to correspond algorithmically to segmental structure, also with its own, but different, structure. Tonal Structure, at least as I discussed it then, was rather primitive, consisting of a sequence of tones (L, H) grouped into morphemes; and this structure was mapped to another linear representation, the sequence of vowels (or syllables) of the segmental structure, in a one-to-one left-to-right manner, in a way that accounted for such phenomena as tonal spreading. Clearly, there is a shape-conserving principle in this, even if I did not explicitly identify it as such; to use the terminology of this book, after the mapping Syllable Structure represents Tonal Structure, in that elements of Tonal Structure are put into one-to-one correspondence with elements of Syllable Structure, and the properties of Tonal Structure (only ‘‘x follows y,’’ since it is a list) are preserved under the representation.

1.5 The Representation Model
If there are systematic circumstances in which grammar seems to want to preserve relations between elements, we might consider building a model from scratch that captures these directly and without contrivance. Suppose we analyze the grammatical system into several distinct components, each of which defines a set of structures (a sublanguage), and which are related to each other by shape-conserving mappings. The syntax of a clause will then be a mapping across a series of representations, from Theta Structure to Case Structure to Surface Structure, and so on. (37)
AS is a partial phonological representation with sentence accent structure assigned. Its role will be developed in chapter 9. To compare with the more standard model, we can see this series of structures (38) as a decomposition of the standard clause structure (39), with the following correspondences: what is done by structural embedding in the standard theory is done by the representation relation in RT; and what is done by movement up the tree in the standard theory is done by isomorphic mapping across this series of representations in RT. (38)
(39)
An immediate consequence of this decomposition is that in RT there can be no such thing as an item being left far behind—everything that is going to be in the clause must make it to the last representation of (38), which would be equivalent to every NP moving to the very top shell of (39). The single deep tree of the standard Pollock-style minimalist theory, on the other hand, allows such ‘‘deep stragglers.’’ Although certain widely accepted accounts of some constructions (e.g., transitive expletive constructions) entail the surface positioning of NPs in original theta positions, it seems that the trend has instead been more and more toward analyses in which NPs never appear in deep positions. To the extent that this trend is responding to some feature of reality, I would say that it confirms RT, in which any other arrangement is not just impossible, but literally incoherent. Another way to contrast the two theories is in how semantics is done. Semantics in the ramified Pollock- and Cinque-style model can be compositional, in the usual sense; but semantics in RT is ‘‘cumulative,’’ in a sense spelled out below and in chapter 9. ‘‘Embedding’’ here is not structural embedding, but ‘‘homomorphic’’ embedding: TS is ‘‘embedded’’ in FS by a series of shape-conserving mappings. Not everything that is a movement in the standard theory will become an interlevel (mis-)mapping in RT. I have already remarked that wh movement is a movement within a level, presumably SS. An interesting pair in this regard is short-distance and long-distance scrambling. Short scrambling might best be modeled as a mismapping between CS and SS, whereas long scrambling might best be treated like wh movement, or perhaps a ‘‘higher’’ mismapping (SS ⇜ FS, for example). The different behavior of short and long scrambling with respect to binding theory and reconstruction should follow from this distinction. (See section 3.1 for details, and chapters 4 and 5 for generalized applications of the chapter 3 methodology.) Many questions about this model and its differences from standard models are still unaddressed. Though most of them will remain so, I will take up two fundamental questions in chapters 3 and 4. First, there is the issue of embedding: how is clausal embedding accomplished in RT? Embedding could have worked something like this: elements defined in ‘‘later’’ systems (QS, FS, etc.) are ‘‘rechristened’’ as theta objects, which can then enter into theta relations in TS. This account would preserve the obvious relation to standard minimalist practice and its antecedents back to Syntactic Structures (Chomsky 1957).
But in chapter 3 I will try out a different view, with surprisingly different consequences: embedding of different subordinate clause types happens at different levels in RT, where the different clause types vary along the dimension of ‘‘degree of clause union.’’ The principle for embedding is, ‘‘Embed at the level at which the embedded object is first defined’’ (the Level Embedding Conjecture of chapter 3). For small embeddings, like that found in serial verb constructions, the level is TS; but for tensed-clause embedding, the level is SS. Second, how is semantic interpretation done in this model? Each of the levels is associated with a different sort of value, and in chapters 4 and 9 I will try to specify what these values are. Perhaps the most important difference between RT and the standard model, then, is that there is not one single tree that represents the meaning; TS represents theta structures, QS scope relations, FS information structure of the kind relevant to focus, and so on. The structure of a sentence consists of a set of structures, one
from each of these components, with the shape-conserving mapping holding among them. Clearly, the meaning is determinable from these representations; for example, it would be trivial to write an algorithm that would convert such representations into classical LF structures. But it is not the case that linguistic meaning can be identified with one of these levels. To borrow a philosopher’s term, one might say that linguistic meaning is supervenient on these representations (if it is not identical with them), in that any difference in the meaning of two sentences will correspond systematically with some difference in their representation structure. Systematicity will guarantee some notion of semantic compositionality. Compositionality will hold within a level, but it will also hold across levels. I am not sure that linguistic semantics needs anything more than this. Having promised to address these two substantive issues in future chapters, I would now like to put aside a concern that I think is overrated. The following sentiment was often expressed to me while I was developing the ideas outlined here: ‘‘You’ve replaced movement governed by distance minimization with holistic mapping between levels governed by shape conservation. But the properties of movement are rather well understood, whereas you can give only the barest idea of what constitutes ‘structure matching’—so the theories aren’t really empirically comparable.’’ My main objection to this is not what it says about my account of shape conservation. I accept the charge. But I must question the claim that there is a notion of movement that is widely accepted, much less understood. If we review the properties of movement, we find that none of them are constant across even a highly selective ‘‘centralist’’ list of works that seek to use movement in significant acts of explanation. What would the properties be?

1. Is movement always to a c-commanding position?
2. Is movement always to the left?
3. Is movement always island governed?
4. Does movement always leave a gap?
5. Does movement always result in overt material in the landing site?
6. Does movement always move to the top?
7. Is movement always of an XP?
For each of these questions it is easy to find two serious efforts at explanation giving opposite answers. For example, in work reviewed in
chapter 6 of this book, Richards (1997) proposes that some movement does not obey islands (question 3). In addition, Richards proposes that movement is not always to the edge of its domain, but sometimes ‘‘tucks in’’ beneath the top element, to use his informal terminology (question 6). Koopman and Szabolcsi (2000) insist that there is no head movement (question 7). And so on. Movement, then, is a term associated with different properties in different acts of explanation, and the intersection of those properties is essentially null. This does not mean that no one who uses the term knows what he or she means by it, only that there is no common understanding. I don’t think that is a bad thing. The different uses are after all related; for example, although it is perfectly acceptable to build a theory in which movement sometimes leaves a gap, and sometimes leaves a pronoun, it would be unacceptable to use the term movement in such a way that it covered none of the cases of gap formation. So it is not that the term is completely meaningless. But still there is no shared set of properties that has any significant empirical entailments on its own. Someone who is pursuing Antisymmetry, for example, will have a very different understanding of the term than someone who is not. It is the familiarity of the term itself that gives rise to the illusion that there is a substantive shared understanding of what it refers to. If every linguist had to replace every use of the term movement with the more elaborate syntactic relation with properties P1, P2, P3, P7, P23, I think fewer linguists would claim that ‘‘movement is rather well understood,’’ and then some audience could be mustered for notions of syntactic relation for which the term movement is not particularly appropriate.
Chapter 2 Topic and Focus in Representation Theory
In chapter 1 I made some rather vague suggestions about how Case systems might be seen as ‘‘representing’’ TS, and in doing so gave some idea about how the ‘‘left end’’ of the RT model uses the principle of Shape Conservation. In this chapter I will turn to the other end and show how the same notion can be used to develop an understanding of how topic and focus interact with surface syntax. This chapter is essentially about the interpretive effects of local scrambling. Although English will figure in the discussion, my chief aim will be to explicate, in terms of Shape Conservation, some mainly well known findings about Italian, German, Spanish, and Hungarian having to do with word order, topic, and focus. The interpretive effects of long-distance scrambling, and its place in RT, will be taken up in chapters 3 and 5, where the A/A distinction is generalized in a way that makes sense of the difference between long- and short-distance scrambling. Long and short scrambling pose a special problem for Checking Theory. Checking Theory provides a methodology for analyzing any correlation between a difference in syntactic form and a difference in meaning: a functional element is postulated, one whose semantics determines the difference in meaning by a compositional semantics, and whose syntax determines a difference in form by acting as an attractor for movement of some class of phrases to its position. That is, interpretable features trigger movement. But, as I will show, in the case of focus the moved constituent does not in general correspond to the Focus. It of course can be the Focus itself; but in addition, it can be some phrase that includes the Focus, or it can be some phrase that is included in the Focus. While the first might be (mis)analyzed as a kind of pied-piping, the second makes no sense at all from the point of view of triggered movement. The problem with Checking Theory that will emerge from the following
observations is that it atomizes syntactic relations into trigger/moved-element pairs, whereas in fact the syntactic computation targets structures holistically.

2.1 Preliminaries
I will use Topic and Focus in their currently understood sense: the Topic consists of presupposed information, and the Focus of new information. Elsewhere (Williams 1997) I have developed the idea that Focus is essentially an anaphoric notion and that Topic is a subordinated Focus. I will take this idea up again in chapter 9, but will ignore it until then. In chapter 1 I introduced two sets of structures, QS (= TopS) and FS. The properties of these structures and their relation to other structures under Shape Conservation will carry the burden of accounting for the features of topic and focus to be examined here. The differences among the languages to be discussed will be determined by either (a) differences in the rules for forming each structure or (b) differing representational demands (e.g., SS ⇝ QS representation ‘‘trumping’’ SS ⇝ CS representation in some languages, with SS ⇜ FS figuring in in a way to be described). QS represents not only the topic structure of the clause, but also the scopes of quantifiers. The reason for collapsing these two is empirical, and possibly false: wide scope quantifiers seem to behave like Topics, and unlike Focuses. First, languages in which topic structure is heavily reflected in surface syntax tend to be languages in which quantifier scope is also heavily reflected. German is such a language, but English is not. Second, focusing allows for reconstruction in the determination of scope, but topicalization does not. The latter difference has a principled account in RT, a topic explored in chapters 3 and 5.

2.2 The Structure of QS and FS
QS and FS bear representational relations to SS: SS represents QS, and FS represents SS. In this section I will give a rough sketch of these structures, leaving many details to be fixed as analysis demands, as usual. One question to be resolved in establishing the basic notions in this domain is, what is the relation between the semantic notions to be represented (Topic status, wide scope) and the structural predicates precedes and c-commands? Most clearly for adjuncts, relative scope seems to depend on the stacking relation, not the linear order, if we can rely on our judgments of the following sentences:

(1) a. John was there a few times every day. (every > few)
    b. [[[was there] a few times] every day]
    c. [[[John was there] every time] a few days] (few > every)

Adjuncts are not subject to the long scope assignment that is characteristic of argument NPs in a language like English, and so the stacking order determines the interpretation: every > few for (1a), and few > every for (1c). By contrast, in (2) the understood order of the quantifiers is ambiguous.

(2) John saw a friend of his every day.

The simplest assumption is that again the stacking order determines the order of interpretation, but that the direct object in (2) is subject to wide scope assignment. So in QS scope is determined by stacking, but some items (NPs in argument positions) are subject to long scope assignment. Unlike quantification, topicalization seems always to be associated with leftward positioning of elements, not just in English, but generally across language types. We will assume that QS incorporates both of these facts, generating a set of structures that represent both topicalization and scope, around a head X. These structures have roughly the following form:

(3) [tree diagram not reproduced]
The structures have a Topic segment and a non-Topic segment with obvious, if not well understood, interpretation; in addition, hierarchical relations determine relative scope. Surface structures are mapped into QS under the regime of Shape Conservation. Since the Topic segment of quantification structures is on the left edge, items on the left edge in SS will be mapped into them isomorphically. In English this will include subjects, and Topics derived by movement.

(4) a. [XP*            [XP* [ . . . ]]]
       Topic segment   non-Topic segment
    b. John            left early
    c. John            I saw yesterday
This permits the Topic-like qualities of the subject position to assert themselves without any explicit movement to the subject position; the subject is mapped to one of the Topic positions in QS just as a moved Topic would be. The interpretation of focus is not at all straightforward. It is traditional to distinguish two kinds of focus, normal and contrastive. In Williams 1981a, 1997, I argued that they should not be distinguished. Here, and especially in chapter 9, I will in fact defend the distinction, but I will rationalize it as involving different RT levels. In this chapter I will use the distinction for expository, nontheoretical purposes. I will take normal focus to be reliably identified by what can be the answer to a question; thus, the Focus in (5B) is exactly that part of the answer that corresponds to the wh phrase in (5A).

(5) A: What did George buy yesterday?
    B: George bought [a hammock]F yesterday.

Contrastive focus, on the other hand, arises in "parallel" structures of the sort illustrated in (6).

(6) John likes Mary and SHE likes HIM.

It will be worthwhile to make this distinction because (a) some languages have different distributions for normal and contrastive focus, and (b) the terminology will be convenient for describing some of the interpretive effects of scrambling discussed here. The Focus itself, in a language like English, is always a phrase bearing accent on its final position. In FS there seems to be a preference for the Focus to come at the end of the sentence; this is reflected in normal focus in Spanish, and in interpretive effects for English scrambling (heavy NP shift). I conclude therefore that FS is characterized by final positioning of Focus. But apparently these directional properties of the English focus system are not fixed universally. Hungarian seems to exhibit the opposite scheme. It has a Focus position that appears at the left edge of the VP, just before the verb; all of the nontopicalized verbal constituents, including the subject, appear to the right.

(7) János Évát várta a mozi előtt.
    János.nom ÉVA.acc waited the cinema in-front-of
    'János waited for ÉVA in front of the cinema.'
    (É. Kiss 1995, 212)
In (7) Évát is focused, as it is the preverbal constituent; János is topicalized. Hungarian FS thus has the following form:

(8) Hungarian FS
    Topic Topic . . . Focus [V XP YP . . . ]

(Furthermore, Hungarian Focuses are left accented, instead of right accented, perhaps an independent property.) In fact, the normal Focus is not always at the right periphery even in languages like English. In addition to rightward-positioned Focuses, particular XPs in particular constructions have the force of a Focus by virtue of the constructions themselves; examples in English are the cleft and pseudocleft constructions.

(9) a. Cleft
       it was XPF that S       It was John that Mary saw.
    b. Pseudocleft
       [what S] is XPF
       XPF is [what S]         John is what Mary saw.
The XPs in such structures can be used to answer questions and so can be normal Focuses, or they can be contrastive Focuses (10a); furthermore, they are incompatible with being Topics (10b). There is thus strong reason to associate the pivots of these constructions with Focus.

(10) a. What did John experience?
        What John experienced was humiliation.
        It was humiliation that John experienced.
     b. What did John experience?
        *It was John who experienced humiliation.
        *John is who experienced humiliation.

I will simply include these structures in FS without speculating about why they do not have the Focus on the right or whether there is a single coherent "definition" of the structures in FS. I will postpone the latter issue until chapter 9, where I take up the general question of how levels determine interpretation.

2.3 Heavy NP Shift
With these preliminaries, I now proceed to an analysis of heavy NP shift (HNPS). I will argue that HNPS is not the result of movement, either to
the left or to the right, but arises from mismapping CS onto SS. In particular, I will argue that Checking Theory does not analyze HNPS appropriately. That focus is implicated in HNPS is evident from the following paradigm:

(11) a. John gave to Mary all of the money in the SATCHEL.
     b. *John gave to MARY all of the money in the satchel.
     c. John gave all of the money in the satchel to MARY.
     d. John gave all of the money in the SATCHEL to Mary.

One could summarize (11) in this way: HNPS can take place to put the Focus at the end of the clause, but not to remove a Focus from the end of the clause—thus, (11b) is essentially ungrammatical. It is as though HNPS must take place only to aid and abet canonical FS representation, in which focused elements are final. (11d) shows that whatever HNPS is, it is optional. In sum, the neutral order (V NP PP) is valid regardless of whether the Focus is final or not, but the nonneutral order (V PP NP) is valid only if NP is the Focus. In fact, though, the situation is slightly more complicated, and much more interesting. In what follows I will refer to the direct object in the shifted sentences as the shifted NP, because in the classical analysis it is the moved element. The form in (11a) is valid not just when the Focus is the shifted NP, but in fact as long as the Focus is clause final in the shifted structure, whether or not the shifted NP is the Focus itself. It is valid both for Focuses smaller than the shifted NP and for Focuses larger than the shifted NP, as the following observations will establish. First, the licensing Focus can be a subpart of the shifted NP.

(12) A: John gave all the money in some container to Mary. What container?
     B: (11a) John gave to Mary all of the money in the SATCHEL.

In this case the Focus is satchel, smaller than the shifted NP. Second, the licensing Focus can be larger than, and include, the shifted NP; specifically, it can be the VP.

(13) A: What did John do?
     B: (11a) John gave to Mary all of the money in the SATCHEL.

In sum, HNPS is licensed if it puts the Focus at the end of the sentence (12), or if it allows Focus projection from the end of the sentence (13). It
thus feeds Focus projection; recall that Focus projection is nothing more than the definition of the internal accent pattern of the focused phrase itself, which in English must have a final accent. This constellation of properties is not well modeled by Checking Theory, including Checking Theories implementing remnant analyses. To apply these theories to the interaction of HNPS and focus would be first to identify a functional projection with a focus feature, then to endow the Focus of the clause with that same focus feature, and then to move the one to the other. Without remnant movement the result would be classical NP shift, a movement to the right. Remnant movement allows the possibility of simulating rightward movement with a pair of leftward movements. Suppose, for example, that NP in (14) is the Focus.

(14) [V NPF PP] → . . . NPF [V t PP] → [V t PP] NPF t

First the focused NP moves; then the remnant VP moves around it. The problem with both the remnant movement and the classical Checking Theory analyses is that the shifted NP is the Focus only in the special case, not in general. So it is hard to see why, for example, a structure like the one in (14) would be appropriate for VP focus—the movement of the NP would be groundless, as it is not the Focus. The correct generalization is the one stated: HNPS is licensed if it results in a canonical AS ← FS representation. This means that it results in the rightward shifting either of the focused constituent or of some phrase containing the focused constituent. So, for example, (11a) with VP focus has the following structure:

(15) CS: [V NP PP] ↚ SS: [V PP NP] ← FS: [V PP NP]F

In other words, the CS, SS mismatch (marked by "↚") is tolerated because of the SS, FS match. In (11b), on the other hand, both CS ← SS and SS ← FS are mismatched.

(16) CS: [V NP PP] ↚ SS: [V PP NP] ↚ FS: [V NP PPF]

This double misrepresentation is not tolerated in the face of alternatives with no misrepresentation.
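The licensing generalization just stated (HNPS is tolerated only when the Focus ends up clause final, whether the Focus is the shifted NP itself, a subpart of it, or a larger phrase containing it) can be put procedurally. The following sketch is purely illustrative and is not part of the RT formalism; the span encoding and the function name are my own assumptions.

```python
# Illustrative sketch, not Williams's formalism: constituents are modeled as
# half-open (start, end) word spans, and the shifted order counts as licensed
# iff the right edge of the Focus coincides with the right edge of the clause.

def hnps_licensed(sentence_len, focus_span):
    """True iff the Focus is clause final in the shifted structure."""
    start, end = focus_span
    return end == sentence_len

# "John gave to Mary all of the money in the SATCHEL" -- 11 words, indices 0-10
words = "John gave to Mary all of the money in the SATCHEL".split()
n = len(words)

# Focus = the shifted NP "all of the money in the SATCHEL" (words 4-10): licensed
assert hnps_licensed(n, (4, 11))
# Focus = the subpart "SATCHEL" (word 10), as in (12): licensed
assert hnps_licensed(n, (10, 11))
# Focus = the whole VP (words 1-10), as in (13): licensed
assert hnps_licensed(n, (1, 11))
# Focus = "MARY" (word 3), as in (11b): not licensed
assert not hnps_licensed(n, (3, 4))
```

The point of the sketch is that the predicate inspects a global property of the clause (the position of the Focus's right edge), not any property of the shifted NP itself, which is exactly what makes the pattern hard to state as triggered movement.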
(In chapter 9 I will elaborate the theory of focus, as well as these representations, with a further relevant level (Accent Structure), but these changes will not affect the structure of the explanations given here.) What this little system displays is an excessive lack of "greed," to use Chomsky's (1993) term: HNPS is licensed by a "global" property of the
VP, not by the shifted NP's needs. This is why it is difficult to model it with Checking Theory, because Checking Theory atomizes the movements and requires each to have separate motivation—interesting if correct, but apparently not. The remnant movement analysis is particularly bad: not only is the wrong thing moved (sometimes a subphrase, sometimes a superphrase of the target), but the ensuing remnant movement has no motivation either. Hungarian focusing shows the same lack of correspondence between displaced constituents and Focuses that English focusing does. Recall that Hungarian has Focus-initial FS structures; furthermore, the Focus itself is accented on the first word.

(17) a. János [a TEGNAPI cikkeket] olvasta . . .
        János the YESTERDAY's articles read
        'János read YESTERDAY's articles . . .'
     b. . . . nem a maiakat.
        not the today's
        '. . . not today's.'
     c. . . . nem a könyveket.
        not the books
        '. . . not the books.'
     d. . . . nem a fürdőszobában énekelt.
        not the bathroom-in sang
        '. . . not sang in the bathroom.'
     (Kenesei 1998, as reported in Szendrői 2001)

The fronted constituent is bracketed in (17a). As (17c) shows, that constituent can be the Focus; but (17b) shows that the Focus can be smaller, and (17d) shows that it can be larger, including the verb. I have suppressed one further detail in connection with HNPS that is now worth bringing to light. (11b) is not, strictly speaking, ungrammatical. Rather, it has a very specialized use: it can be used "metalinguistically," as in (18).

(18) A: John gave to Joe all the money in the SATCHEL.
     B: No, John gave to MARY all the money in the satchel.

That is, it can be used to correct someone. Rather than brushing these examples aside, I will show that their properties follow from the way in which phonological and syntactically defined focus are related to each other. But I will not do this until chapter 9, where I take up the notion of
the "values" that are defined at each level, and how the values of one level are related to the values of other levels. So, HNPS is analyzed here, not as a movement, but as a mismapping between CS and SS that is licensed by a proper mapping between SS and FS. As such, it should not show the telltale marks of real movement; that is, it should not leave phonologically detectable traces, it should intersect rather than nest with itself, and so on. Some of these behaviors are hard to demonstrate. However, there is one property of HNPS that has been put forward to show that it is a real movement: it can license parasitic gaps, and so is in fact a kind of Ā movement. (19) is the kind of sentence that is meant to support this idea.

(19) John put in the satchel, and Sam t in the suitcase, all the money they found.

The argument is based on the correct hypothesis that only "real" traces of movement can license parasitic gaps, but it wrongly assumes that HNPS is necessarily involved in the derivation of such examples. In fact, such examples can arise independently of HNPS, through the action of right node raising (RNR), a process not fully understood, but clearly needed in addition to HNPS. RNR, in the classical analysis, is an across-the-board application of a rightward movement rule in a coordinate structure, as illustrated in (20).

(20) John wrote t, and Bill read t, that book.

This analysis of RNR has been contested (see Wilder 1997; Kayne 1994), but not in a way that changes its role in the following discussion. Given such a rule, we would expect sentences like (19) even if there were no HNPS, so it cannot be cited to show that HNPS is a trace-leaving movement rule. We can understand (19) as arising from the across-the-board extraction of the NP [all the money they found] from the two Ss that precede it, thereby not involving HNPS essentially (though of course the input structures to RNR could be shifted; it is hard to tell).
(21) [John put ti in the satchel] and [Sam put ti in the suitcase] NPi

Evidence that RNR is the correct rule for this construction comes from the fact that HNPS does not strand prepositions, combined with the observation that such stranded prepositions are indeed found in sentences analogous to (21).
(22) a. John talked to ti about money, and Bill harangued ti about politics, [all of the . . . ]i
     b. *John talked to ti about money [all of the . . . ]i
Although awkward, (22a) is dramatically better than (22b), and so HNPS is an unlikely source for sentences like (21). See Williams 1994b for further argument. Although the failure of HNPS to leave stranded prepositions is used as a diagnostic in the argument just given, it is actually a theoretically interesting detail in itself. If HNPS is a movement rule, and, I suppose, especially if it is a leftward parasitic-gap-licensing movement, as it is in the remnant movement analyses of it, then why does it not strand prepositions, as other such rules do? In the RT account, HNPS arises in the mismatch between SS and CS: the same items occur, but in different arrangement, so stranding cannot arise, as stranding creates two constituents ([P t] and NP) where there was one, in turn creating intolerable mismatch between levels.

2.4 Variation
Some levels are in representation relations with more than one other level, giving rise to the possibility that conflicting representational demands will be made on one and the same level. An item in SS, for example, must be congruent to a Case structure and to a quantification structure, and these might make incompatible demands on the form of SS. Since mismatches are allowed in the first place, the only question is whether there is a systematic way to resolve these conflicts. I will suggest that languages differ with respect to which representation relations are favored. This arrangement is somewhat like Optimality Theory (OT), if we identify the notion "shape-conserving representation relation" with "faithfulness." But RT and OT differ in certain ways. In RT only competing representation relations can be ranked, and they can be ranked only among themselves and only where they compete on a single level. Intralevel constraints are simply parts of the grammar of each independent sublanguage, and so cannot be ranked with the representation relations those sublanguages enter into. In this regard RT is more restrictive than OT. On the other hand, I will be assuming that the properties of the sublanguages themselves are open to language-particular variation; and in this respect RT is less restrictive than OT, as OT seeks to account for
all language particularity through reordering of a homogeneous set of constraints. RT also resembles theories about how grammatical relations (subject, object, etc.) are realized in syntactic material. For example, Lexical-Functional Grammar (LFG; Kaplan and Bresnan 1982) posits two levels of representation, f-structure and c-structure. F-structure corresponds most closely to the level called TS here, and c-structure corresponds most closely to everything else. An algorithm matches up c-structures and f-structures by generating f-descriptions, which are constraints on what c-structures can represent a given f-structure. Since the overall effect is to achieve a kind of isomorphism between c-structures and f-structures, the grammatical system in LFG bears an architectural similarity to the RT model, especially at the "low" (TS) end of the model, even though there is no level in RT explicitly devoted to grammatical relations themselves, that work being divided among other levels. Similar remarks apply to the analysis of grammatical relations presented in Marantz 1984. LFG differs from RT in several ways. First, the matching between c-structure and f-structure is not an economy principle, so the notion "closest match" plays no role. The LFG f-description algorithm tends to enforce isomorphism, but its exact relation to isomorphism is an accidental consequence of the particulars of how it is formulated. By comparison, in RT exact isomorphism is the "goal" of the relations that hold between successive levels, and deviations from exact isomorphism occur only when, and to the exact degree to which, that goal cannot be achieved. Second, LFG posits only two levels, whereas RT extends the matching to a substantially larger number of representations, in order to maximize the work of the economy principle. Third, and most important, the place of embedding in the two systems is different.
I will propose in chapter 3 that embedding takes place at every level, in the sense that complements and adjuncts are embedded in later levels that have no correspondents in previous levels. In LFG, if embedding is done anywhere, it is done everywhere; that is, if a clause is present in c-structure, it has an f-structure image. Thus, the predictions of RT made and tested in chapters 3–6 are not available in LFG.

2.5 English versus German Scrambling
Let us now turn to a systematic analysis of the difference between English and German in terms of mismapping between levels. Keeping in mind the
rough characterization of QS and FS given above, we may now characterize that difference as follows:

(23) a. German: SS → QS > SS → CS
     b. English: SS → CS > SS → QS
     c. Universal: SS ← FS

That is, in German SS representation of QS is more important than SS representation of CS (signified by ">"); in English the reverse is true. And in all languages of course FS represents SS. Let us now examine what expectations about German will flow from the specifications in (23). Perhaps arbitrarily, I identify the following four:

1. Two definite NPs in German should not be reorderable, apart from focus.
2. Definite pronouns move leftward.
3. A definite NP obligatorily moves leftward over (only indefinite) adverbs.
4. Surface order disambiguates quantification, except where Q is focused.

Expectation 1: First, two definite NPs in German should not be reorderable, unless a special focusing is to be achieved. This is true because SS must represent CS, unless that requirement is countervailed by some other representational need.

(24) Two definites are not reorderable with normal focus
     a. IO DO V (CS order)
     b. *DO IO V

This conclusion is true, and in fact is a commonplace of the literature on the German middlefield; see Déprez 1989 for a summary account. Expectation 2: Definite pronouns appear on the left edge in SS, as required by QS (= TopS), since they are always D-linked—again, a commonplace of the literature. Expectation 3: A definite NP will move leftward over an adverb, in defiance of CS, in order for SS to match QS, as definites always have wider scope than indefinite adverbs; see the end of this section for a discussion of the behavior induced by definite adverbs, based on findings of Van Riemsdijk (1996). But the pull to the left to move the direct object into the clause-initial Topic field of QS can be countervailed by the need to place narrow focus on the object, as in (25b), which makes leaving the NP after the adverb an option, even though the NP is D-linked. The key
here is to understand that an NP can be both focused and D-linked, in effect both focused and topicalized, and that both of these properties are needed to understand the German middlefield behavior. The following cases show that these expectations are fulfilled:

(25) Definites move left, except if narrowly focused
     a. weil ich die Katze selten streichle
        because I the cat seldom pet
        'because I seldom pet the cat'
     b. ?*weil ich selten die Katze streichle
        (good only if contrastive focus on Katze (Diesing 1992) or [Katze streichle] (M. Noonan, personal communication))
     c. weil ich die KATZE selten streichle
        (only narrow focus on KATZE)
     d. What did Karl do?
        Den HUND hat Karl geschlagen.
        the dog has Karl beaten
        'Karl beat the DOG.'
        (Prinzhorn 1998)

In passing, note the difficulty this sort of example poses for a remnant movement analysis of topicalization, or for rightward movement. The problem, in both cases, is that the verb stays at the end, no matter what. If we assume SVO order (as remnant movement theories generally do for SOV languages), then to derive (25b) where the object is focused, we must perform the operations of focusing and remnant movement, resulting in something like one of the two following derivations:

(26) a. weil ich selten streichle die Katze
        weil ich die Katze [selten t streichle]    → topicalization
        weil ich [selten streichle] die Katze      → remnant movement
        weil ich selten die Katze streichle        → ?? derive SOV order
     b. weil ich selten streichle die Katze
        weil ich selten die Katze [streichle t]    → derive SOV order
        weil ich die Katze [selten streichle t]    → topicalization
        weil ich selten die Katze streichle        → ?? remnant movement
The last step is the puzzler—how to get the verb in final position again, but at the same time end up with the adverb before the direct object. The operations otherwise motivated, including the remnant movement half of focusing, do not seem to have the properties needed to achieve this.
In German, scrambling is more or less obligatory to disambiguate the scope of coarguments, so there is much less surface quantifier ambiguity in German than in English. This is because German favors QS representation over CS. But again, there is an important exception: when the second of the two NPs is narrowly focused, it can remain in situ and be scopally ambiguous there. The important thing here is that the possibility of wide scope in the rightmost position is dependent on narrow focus. Despite the other differences between the two languages, German behaves identically to English in this respect, mimicking the special contours of the HNPS construction discussed in section 2.3, mutatis mutandis: in German FS countervails QS representation, whereas in English HNPS it countervails CS representation. Importantly, German does not require that the rightmost NP be the Focus itself; rather, it must be a part of a narrow Focus, as (25b) shows. This detail precisely matches the case of English HNPS. It would appear that the "global" property of having a canonical FS representation overrides the German-particular requirement that SS be a canonical QS representation. Expectation 4: The notion that QS is the level in which both Topics and quantifiers get their scopes is supported by the fact that scope interpretation interacts with focusing in exactly the same way that Topics do, as the following examples establish:

(27) Movement disambiguates quantified NPs
     a. dass eine Sopranistin jedes Schubertlied gesungen hat (eine > jedes)
        that a soprano every Schubert song sung has
        'that a soprano sang every song by Schubert'
     b. dass jedes Schubertlied eine Sopranistin gesungen hat (jedes > eine)
     (Diesing 1992)

(28) "Unmoved" NP is ambiguous if and only if narrowly focused
     a. Er hat ein paar Mal das längste Buch gelesen. (ambiguous)
        he has a couple times the longest book read
        'He read the longest book a couple of times.'
     b. Er hat das längste Buch ein paar Mal gelesen. (unambiguous)
Example (28) in particular shows that a wide scope quantifier can be left in situ exactly in case it is narrowly focused.
In remnant movement Checking Theories (28a) would need to be represented as follows:

(29) a. Assign (i.e., check) scope
        er hat das längste Buch [ein paar Mal [t gelesen]]
     b. Assign (i.e., check) Focus
        er hat [ein paar Malj [das längste Buchi [tj [ti gelesen]]]]

Ein paar Mal must move precisely because das längste Buch is the Focus, and thus not for reasons of its own. The difficulty is increased, just as it was in the case of HNPS, by the fact that the same word order and scope interpretation are possible if the whole VP [das längste Buch gelesen] is narrowly focused. In other words, not only does narrow focus in a quantified NP permit in-situ positioning, but so does canonical Focus projection from that NP. Again, this is exactly the behavior found earlier for HNPS in English. Although I do not have relevant examples, I would expect the same results in (27) and (28) if the Focus was a subconstituent of the direct object (e.g., contrastive focus on the noun Buch), again by parallelism with the HNPS facts. The overall relation of focus to topic in German can be summarized in the following cascade of exceptions:

(30) NP must be in Case position
       except if D-linked or wide scoped
         except if narrowly focused or part of a canonical narrow Focus.

RT derives this cascade from the competition of congruences that SS must enter into. In English, SS does not represent QS, but rather CS; thus, quantifier ambiguities abound.

(31) He has read the longest book a couple of times.

Example (31) is ambiguous even if the whole sentence is the Focus (as it would be, for example, in answer to the question, What happened?). The two readings have the following structures:

(32) a. CS ← SS ↛ QS (narrow scope for the longest book)
     b. CS ← SS → QS (wide scope for the longest book)

By the logic of RT, (32a) is tolerated, in the face of (32b), because (32a) gives a meaning that (32b) does not.
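The competition of ranked congruences that distinguishes German from English can be caricatured computationally. The sketch below is not part of the book's formalism: the order-matching function, the candidate set, and the lexicographic cost are assumptions introduced purely for illustration of how ranking SS representation of QS above SS representation of CS (German), or the reverse (English), selects different surface orders from the same input.

```python
# Illustrative sketch, not Williams's formalism: ranked shape conservation.
# A level "matches" SS when the items they share appear in the same linear
# order; a language ranks which congruence it is willing to sacrifice first.

def matches(ss, other):
    # SS is congruent to another level iff shared items occur in the same order
    shared = [w for w in ss if w in other]
    return shared == [w for w in other if w in ss]

def best_ss(candidates, cs, qs, ranking):
    # ranking: levels from most to least important, e.g. ("QS", "CS");
    # a mismatch on a higher-ranked congruence is lexicographically worse
    levels = {"CS": cs, "QS": qs}
    def cost(ss):
        return tuple(0 if matches(ss, levels[lv]) else 1 for lv in ranking)
    return min(candidates, key=cost)

cs = ["IO", "DO", "V"]          # Case-structure (base) order
qs = ["DO", "IO", "V"]          # QS wants the D-linked direct object leftmost
cands = [["IO", "DO", "V"], ["DO", "IO", "V"]]

# German ranking: the definite object scrambles left to match QS
assert best_ss(cands, cs, qs, ("QS", "CS")) == ["DO", "IO", "V"]
# English ranking: base order wins, so quantifier ambiguity survives at SS
assert best_ss(cands, cs, qs, ("CS", "QS")) == ["IO", "DO", "V"]
```

The sketch deliberately ranks only the two competing congruences against each other, mirroring the restriction noted in section 2.4 that in RT, unlike OT, only representation relations competing on a single level can be ranked.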
But it is not enough for a misrepresentation (or in classical terms, a movement) to serve some purpose—it matters which purpose. For example, HNPS is not justified simply to achieve QS ← SS representation, as (33) shows.

(33) *John gave to every FRIEND of mine a book. (every > a)

Rather, HNPS is justified only to achieve FS → SS congruence, as established earlier. Although it is conceivable that a language could work the other way (since in fact German does), English does not. It does not because it rates CS representation over QS representation tout court. In the main line of work within the ramified Pollock-style theory of clause structure, the leftward positioning of topicalized NPs is achieved by movement—that is, by the same kind of relation that wh movement is. Evidence of movement comes from viewing the different positions an NP can occupy under different interpretations, where positions are identified with respect to adverb positions. This methodology has been thoroughly explored in a variety of languages. Van Riemsdijk (1996) has pointed out the following problem with this methodology. In German the adverbs themselves seem subject to the same dislocating forces as the NPs; that is, definite adverbs such as dort 'there' move leftward, compared with their indefinite counterparts such as irgendwo 'somewhere', as the following paradigm illustrates:

(34) a. Ich habe irgendwem/dem Typ irgendwas/das Buch versprochen.
        I have someone/that guy something/the book promised
        'I promised someone/that guy something/the book.'
     b. *Ich habe irgendwas dem Typ versprochen.
     c. Ich habe das Buch dem Typ versprochen.
(35) a. Sie hat irgendwo/dort wen/den Typ aufgegabelt.
        she has somewhere/there someone/that guy picked up
        'She picked someone/that guy up somewhere/there.'
     b. ??Sie hat irgendwo den Typ aufgegabelt.
     c. Sie hat dort den Typ aufgegabelt.
     (Van Riemsdijk 1996)
Example (34) shows the relative ordering properties for a definite and an indefinite NP, and (35) shows the same thing for an adverb and an NP: a definite NP is bad after an indefinite adverb, but OK after a definite adverb. This finding calls into serious question whether adverbs can be used as a frame of reference against which to measure the movement of NPs. It
also calls into question the notion that adverbs occupy fixed positions in functional structure determined solely by what they are understood to be modifying. And it suggests that everything, including adverbs, is moving in the same wind, or rather the same two countervailing winds of QS (= TS) and FS.

2.6 Hungarian Scope
Brody and Szabolcsi (2000) (B&S) present Hungarian cases just like the German cases observed by Noonan and others cited earlier. That is, moved quantifiers are unambiguous in scope, while unmoved ones are ambiguous; but not moving has consequences for focus. According to standard analyses since É. Kiss 1987, Hungarian quantified NPs (including the subject) are generated postverbally and then moved to the left of the verb; leftward movement fixes scope. There are two types of position to the left of the verb: a single Focus position immediately to the left of the verb, and then a series of "Topic" positions to the left of that, giving the following structure:

(36) [NPT NPT . . . NPF V . . . ]

To illustrate: (37a) is not ambiguous, but (37b) is ambiguous. This is because in (37a) both NPs have moved, so their relative scope is fixed; but in (37b) minden filmet has not moved, so it is scopally ambiguous.

(37) a. Minden filmet kevés ember nézett meg. (every > few)
        every film few people saw prt
     b. Kevés ember nézett meg minden filmet.
        few people saw prt every film
     (B&S 2000, 8)

But B&S have provided a more fine-grained version of the facts. They report that the accent pattern of the sentence disambiguates (37b); in particular, if minden filmet is accented, then it has wide scope over kevés ember.

(38) a. Kevés ember nézett meg MINDEN FILMET. (every > few)
     b. Kevés ember nézett meg minden filmet. (few > every)

This is now a familiar pattern, the same one we have seen in German and English; but how it arises in Hungarian remains to be spelled out, and this requires a few remarks about the Hungarian FS and QS levels.
46
Chapter 2
Like English, Hungarian allows multiple Focuses, and only one of them can occupy the designated Focus position to the left of the verb. Secondary Focuses can be located to the right of the verb; they cannot occupy the positions to the left of the primary Focus, as these are Topic positions. Thus, the Hungarian FS looks like this:

(39) Hungarian FS
     [ . . . F V . . . (F) (F)]

The initial Focus position is the "normal" position for a single Focus; in particular, it is the position from which Focus "projects" in Hungarian. The postverbal Focus positions, if a sentence has any, are strictly narrow, nonprojecting Focus positions. From these remarks, we can see that the RT analysis of Hungarian is essentially the same as that of German: in particular, SS ↜ QS > SS ↜ CS (i.e., SS representation of QS dominates SS representation of CS). However, as in German, FS representation can tip the balance back. Apart from considerations of focus, in order for minden filmet to have wide scope, it would need to appear in preposed position, as it does in (37a). From the fact that preposing fixes relative scope among the preposed elements, we can conclude that Hungarian QS has the following structure:

(40) Hungarian QS
     [QPi [QPj V . . . ]], where QPi has scope over QPj

And from the fact that apart from special focusing considerations, preposing of quantified NPs is essentially obligatory, we can again conclude that SS representation of QS dominates SS representation of CS. We can see the two requirements of (39) and (40) interacting exactly in the case of a wide scope focused quantified NP. If there is a single Focus, it must occur in the single preverbal canonical Focus position, to satisfy Focus representation. Such representation will also fix its scope. But if there are two Focuses, only one can appear preverbally. The other must appear postverbally, for the reason already discussed. The following problem then arises.
Suppose the second Focus is to have wide scope, the situation of minden filmet in (38a). A case like this has the following representational configuration:
(41)
As is clear, QS is misrepresented by SS. Ordinarily, this would not be tolerated, but in this special circumstance SS representation of FS compensates. If, on the other hand, minden filmet is not a Focus, as in (38b), then it must move in order to take wide scope; the reason is that the match with FS will not be improved by not moving, whereas the match with QS will be. In other words, for (38b) the following three structures will be in competition with each other: (42)
Leaving CS representation aside, (42b) and (42c) are clearly superior to (42a), as (42a) has a misrepresentation of QS. But (42b) and (42c) represent different meanings: (42b) has wide scope for QPi and (42c) for QPj. (42c) and (42a) are competing for representation of wide scope for QPj, and (42c) wins. The result is that (42b) must be the representation for (38b) where the second quantifier minden filmet is unmoved, and so it must have narrow scope. The difference that focus makes is that (42c) is not a viable candidate to represent focus on the second NP, and so (42a) wins unopposed. By similar reasoning, we can explain why two preverbal QPs have fixed scope. I will assume that neither is focused, so QS representation is all that is at stake. The canonical mapping gives the surface order (43b), so the question is why the noncanonical mapping in (43a) is barred.

(43)
It turns out that (43a) is blocked by an alternative surface order, which represents the scope order perfectly.

(44) SS: QPj QPi V
     QS: QPi QPj V
Thus, these Hungarian cases pattern just like the German cases considered earlier. Given this parallelism, one would expect parallels to the cases in German in which the apparently "moved" phrases are not the focused phrases themselves, but projections of the focused phrases, or subparts of the focused phrases. I do not know the pertinent facts. B&S give a different analysis of the ambiguity of (37b). In their view, on the wide scope reading for minden filmet it has the structure in (45), and the reason minden filmet has wide scope is that it is structurally higher than the subject + V.

(45) [Kevés ember nézett meg] minden filmet.

The problem this analysis raises is of course that the subject + V is not a natural constituent. However, in the framework adopted by B&S it is: it arises in a derivation in which both NPs are preposed to a position in front of the verb.

(46) minden filmet [kevés ember [nézett meg t t]]FP

The traditional Hungarian Focus position is the position immediately preceding the verb; accepting this traditional account, B&S call the constituent consisting of the VP and the first-to-the-left NP FP. Then this entire FP is itself preposed, giving the structure in (47).

(47) [Kevés ember [nézett meg t t]]FP [minden filmet t].

That is, the derivation proceeds by hyper-Kaynian remnant movement. There are some special problems here for analyses that use remnant movement. The first is that such analyses cannot be applied to German, for reasons given in the preceding section, nor can they be applied to English HNPS, also for reasons already given—essentially, the two-way failure of correspondence between the Focus and the moved constituent. But there is a problem peculiar to Hungarian itself. The remnant movement of the subject + V is actually a movement of the entire FP, which consists of the entire VP and the focused constituent that immediately precedes it. So, one would expect any phrase that was a part of the VP to show up to the left of the in-situ QP; but in fact, such phrases (videón 'on videotape', in the following example) can appear either before or after that QP,

(48) a. Kevés ember nézett meg videón minden filmet.
     b. Kevés ember nézett meg minden filmet videón.

and the scope of minden filmet in both cases can be construed as wide (B. Ugrozdi, personal communication). Example (48a) is compatible with all theories, but (48b) is mysterious for B&S's account, as it must have the following structure:

(49) [Kevés ember [nézett meg]VP ] minden filmet tFP videón.

Somehow videón has escaped the VP (and FP), to the right. Pursuing the logic of radical remnant movement, we might assign this example the following structure, in which the apparent rightward movement of videón is really the result of its leftward movement, plus radical leftward remnant movement:

(50) a. [kevés ember minden filmet videón [nézett meg t t t]] →
     b. [nézett meg t t t] [kevés ember [minden filmet [videón t . . .
But the problem with this is that there should be no space between minden filmet, which is focused, and the verb, as the Focus must always precede the verb directly. The general character of the problem that Hungarian poses for checking theories of focus and topic is no different from what we have seen for other languages: Checking Theory armed with triggering features for focus and topicalization will wipe out any trace of Case and theta structure: once a remnant movement has taken place, all trace of Case and theta structures is invisibly buried in entirely emptied constituents. This consequence of remnant movement does not seem to hold empirically.

2.7 Spanish Focus
We have adopted the "answer to a question" test for identifying normal focus. English allows normal focus anywhere, not just on the right edge, as the constitution of FS would lead us to expect.

(51) A: Who did John give the books to t?
     B: John gave MARY the books.

This can be taken to show that English allows FS to be misrepresented by SS, sacrificed in this case for accurate CS representation.

(52)
Spanish, on the other hand, does not seem to permit nonfinal normal Focuses—at least, not as answers to questions.

(53) A: Who called?
     B: *JUAN llamó por teléfono.
         JUAN called
        (Zubizarreta 1998)
     B′: Llamó por teléfono JUAN.

(54) Spanish FS ↝ SS > . . .

The logic of this chapter suggests that Spanish differs from other languages in favoring FS ↝ SS representation over all others. The fact that Spanish has a subject-postposing rule (as illustrated in (53B′)) aids it in meeting this requirement, though RT does not causally connect the ungrammaticality of (53B) with the presence of the postposing rule. One reason for making no such connection is that other languages with subject postposing (specifically Italian; see (55)) permit both (53B) and (53B′).
The ungrammaticality of (53B) follows directly. A related but different approach to the problem would be to allow Spanish to have the same FS as English, and to block (53B) by (53B′)—that is, to say that the mere availability of (53B′) is enough to ensure that (53B) is blocked. I think this is the wrong approach in general. First, there are languages like Italian, where the analogues of both (53B) and (53B′) are grammatical.

(55) A: Who called?
     B: GIANNI ha urlato.
         GIANNI has called
     B′: Ha urlato GIANNI.
     (Samek-Lodovici 1996)

Second, even in a language like English, which lacks subject postposing, we can create cases where the same logic would apply, blocking completely grammatical answer patterns like (56B).

(56) B: I gave the SATCHEL to Mary.
     B′: I gave to Mary the SATCHEL.

Clearly, the alternative order in (56B′) does not compete with the order in (56B), or at least it does not win. In German and English we saw that focus considerations can countervail requirements of scope assignment. In Spanish we would expect focus considerations to override requirements of scope assignment. That is, we should find cases where NPs are obligatorily mis-scoped in surface structure because of overriding focus requirements. I do not have the relevant facts at the moment. There is one methodological obstacle to getting relevant facts: we have identified normal focus with answerhood, but answers to questions generally take wide scope. This is not to say that Spanish lacks any sort of Focus non-phrase-finally—it lacks only the kind of Focus that is needed for answering questions. Zubizarreta (1998, 76) gives the following example:

(57) JUAN llamó por teléfono (no PEDRO).
     JUAN called not PEDRO

Here a phrase-initial accented NP can serve as a contrastive Focus—just where it cannot serve as a Focus for the purpose of answering questions.
In chapter 9 I will embed a theory of contrastive versus normal focus in a theory of the values assigned at each level: FS will be the input to question interpretation, but Accent Structure (a level to be introduced in chapter 9), which normally "represents" FS by matching an accented phrase to a focused phrase at FS, will be shown to give special metalinguistic effects when FS is not canonically represented, as in (57). What happens in Spanish when a normal Focus cannot be postposed, for some reason intrinsic to the structural (i.e., CS- or SS-related) restrictions in the language? It is not clear, as it is difficult to form a question in Spanish where the question word is nonfinal, because postposing and reordering always seem to permit postposing. Nevertheless, small clause constructions might be relevant cases.

(58) A: Con quién llegaron enferma?
         with who arrived sick
        'Whoi did they arrive with sicki?'
     B: Llegaron con MARIA enferma.
     B′: *Llegaron con enferma MARIA.
     B″: *Llegaron enferma con Maria.
     (J. Camacho, personal communication)

As the translation indicates, the PP con MARIA modifies the verb, and the adjective enferma (with feminine ending) modifies Maria and so enters into some kind of secondary predication relation with it. That predication relation does not permit postposing, of either Maria or the PP con Maria. In that case the normal Focus can be nonfinal, as in (58B). This shows that Spanish does permit nonfinal normal Focuses, but only when it has no choice. What does it mean to have no choice? In RT it must mean one of two things. First, it could mean that the representing level simply has no form that corresponds to [V PP AP], the form of the VP in (58B′). Second, it could mean that SS, in addition to representing FS, must also represent some other structure, presumably the one in which small clause predication is adjudicated, and that the call to represent that structure is stronger than the call to represent FS. As I have no considerations favoring one over the other, I will let the question stand.

2.8 Russian Subjects
Russian exhibits the same behavior we found in German scrambling and English HNPS: obligatory leftward positioning of elements unless they are narrowly focused.
(59) a. Uši založilo.
        ears.acc.pl clogged-up.neut.sg
        (Lavine 1997)
     b. *Založilo uši. (unless uši is narrowly focused)
        (S. Harves, personal communication)
The only argument to založilo is the accusatively marked internal argument uši; one would normally expect it to appear postverbally, as other such internal arguments would. But in fact that is not the normal order for such sentences; rather, the order in which the argument occurs preverbally is the normal order. It is normal in the sense that it is the only order, for example, in which Focus projects, and so the only focus-neutral order. The difference between German and Russian lies in the freedom with which arguments can cross the verb. Nothing like Holmberg's generalization holds in Russian. There are two ways to account for this state of affairs. I will outline them, without choosing between them. The first possibility, the simpler of the two, is that Russian FS imposes the NP V order, in that such a structure is the only one from which Russian permits Focus projection. In other words, Russian FS has the following structures, among others (where ′ marks accented positions).

(60) Russian FS
     a. [NP′ V NP″]F
     b. [NP′ V]F
     c. [V NPF′]

The pattern in (60b) is in fact the pattern for Focus projection in English intransitive sentences.

(61) a. One of my friends′ died.
     b. One of my friends died′.

If the main accent is on died, as in (61b), then died also bears narrow focus; but if it is on friends, as in (61a), then it can project to the entire sentence. Under this regime the derivation of (59a,b) would look like this:

(62) a. CS: [založilo uši] ↝ SS: [založilo uši] ↚ FS: [uši založilo]F
     b. CS: [založilo uši] ↝ SS: [založilo uši] ↜ FS: [založilo ušiF]
In this scheme Russian has a notion of subject at FS in the sense that only structures with a preverbal NP allow projection. But the requirement that there be a subject could arise somewhat earlier, so long as it did not arise as early as CS, or wherever nominative Case is assigned, because it clearly has nothing to do with nominative Case. Suppose, for concreteness, that there is an SS requirement that there be a subject, which must be met even if there is no nominative. Given such a requirement, surface structures would have to have the following form, where the first NP is the "subject":

(63) SS: [NP V . . . XP]

In that case the structures assigned to (59a) would look like this:

(64) CS: [založilo uši] ↛ SS: [uši založilo] ↜ FS: [uši založilo]F

That is, SS misrepresents CS, but faithfully represents the Focus-projecting FS. Because constraints within levels are inviolable, the surface structure for (59b) must be the same as the surface structure for (59a); but then, the "heard" output is wrong, since the form of (59b) is Založilo uši. In order to model the facts, there must be a "heard" representation that is subsequent to SS; suppose that FS is such a representation. Then, FS will (mis)represent SS, rather than the reverse, and the following derivation is possible:

(65) CS: [založilo uši] ↛ SS: [uši založilo] ↛ FS: [založilo ušiF]

Here SS misrepresents CS, as it must in order to meet the SS subject requirement; in addition, FS misrepresents SS, presumably in order to achieve narrow focus on uši. In later chapters I will adopt two elements from this analysis: (a) that FS represents SS, rather than the reverse, and (b) that different levels have different "subjects." The notion "subject" could be a feature of many different levels, but with predictably differing properties, if the properties depend on the properties of the levels themselves.
Shape Conservation will tie the subjects together: in the canonical mapping between levels, subject at one level will map to subject at the next. See section 3.2.2 for a generalization of the notion "subject" across the levels. In the second account SS has a notion of subject that motivates the first mismapping. This notion of subject is completely analogous to the Extended Projection Principle (EPP), since it is understood as a requirement distinct from Case assignment in minimalism. See Lavine (forthcoming) for extended argument for this arrangement in a minimalist account of Russian. What Russian adds to the picture developed here is the fact that complement and verb can reorder in mismapping. In Germanic and Romance any reordering of complement and head is associated with Case, and the evidence for separating the EPP from Case has come largely from expletive constructions. Lavine's work establishes that the phenomenon is a good deal more general. I will return to Russian impersonal verbs in chapter 5, after necessary notions about the RT levels are introduced in chapter 3. At present we have no means to weigh the relative cost of mismapping that respects head order and mismapping that does not. In Williams 1994b, in a different theoretical context, I proposed the principle TRAC, which suggested that reordering (for scrambling) was compelled to maintain the theta role assignment configuration, which among other things specified the directionality of theta role assignment; but clearly this is not generally true. Still, although I have no concrete suggestion to offer at this point, I am tempted to think that reorderings that violate TRAC are more costly than reorderings that do not.

2.9 Conclusion
2.9.1 Semantics of Form

The facts pertaining to the interaction of scrambling, topic, and focus provide a rich testing ground for theories attempting to account for correlations between syntactic form and meaning. Checking Theory provides a simple account, interesting if correct because it assumes a straightforward compositional semantics: interpretable features are interpreted in situ, accounting for meaning, and they act as syntactic attractors, accounting for form. But for the constructions examined here, this account does not seem to work; instead, what we find is a holistic matching of a clause structure with a Case structure on the one hand and a quantification structure on the other, without the possibility of reducing the interrelations involved in the match to a set of triggered movement relations. This is because the possibility of mismatching two structures depends crucially on what other structures exist, and because the "moved" constituent does not correspond to the constituent on which the interpretation turns.
Perhaps the most radical conclusion that can be drawn from this is that semantics is not compositional in a significant sense: the quantification structure of a clause is fixed holistically, by matching a surface structure with an independently generated quantification structure, and how that match works is determined by what other matching relations the surface structure enters into. To this extent, the quantification and focus structures of a sentence are not determined by a strictly compositional computation. If this conclusion is accepted, then we must account for why semantics appears to be compositional. I think we can best understand this by considering the question, when would a pattern-matching theory of semantics be fully indistinguishable from a compositional semantics? The answer is, when every possible attempt to match succeeded—when for any given quantification structure there was a surface structure that fully matched a Case structure and a focus structure, so that full isomorphism held across the board. In that case we could use either theory interchangeably; the result would always be the same. If the conclusion of this chapter is correct, English and German approximate this state, but neither achieves it, and in fact they deviate from it in different ways. The approximation is close enough that if only a narrow range of facts is examined in any one analysis, the failure of compositionality will escape detection. Given substantive conclusions about the nature of each of the sublanguages, it is probably inevitable that a completely isomorphic system would be impossible.

2.9.2 How Many Levels?

How many levels are there? In this chapter I suggested four or five (CS, TS, SS, QS, FS). At different points in what follows, I will talk about models with different numbers of levels. What is the right number? If we had the right number, and the properties of each, we would pretty much have a complete theory. I have nothing like that.
What I have instead is evidence for a number of implicational relations of the sort, "If property A occurs in level X and property B occurs in later/earlier level Y, then it follows that . . ."; and in fact the discussion in this chapter has had exactly this character. These implicational predictions exploit the main idea without requiring a full theory, and seem sufficiently rich to me to encourage further investigation into what might be viewed as a family of representation theories.
Every theory—or more properly, every theoretical enterprise—has at least one open-ended aspect to it. For example, different Checking Theories propose different numbers of functional elements and different numbers of features distributed among them. It is no trivial matter to determine whether some group of checking analyses, and the Checking Theories that lie behind them, are compatible with one another, and consequently whether there is a prospect of a final Checking Theory that is compatible with all of those analyses. What makes them all Checking Theories is that they all have the same view of the design plan of syntax: they all incorporate some notion of movement governed by locality or economy that results in checked features, which are used up. The same is true of representation theories. In chapter 4 I introduce a new level, Predicate Structure. The reason for the new level is that the levels determined by the considerations in chapters 1–3 do not allow enough distinctions. In introducing the new level, I assume, basically without demonstration, that it is compatible with the results of the previous chapters. In chapter 9 I introduce a new kind of level, Accent Structure, for focus. Again, I do so because the levels proposed earlier do not allow enough distinctions, and I hope that the newly extended theory is at least compatible with the results of this chapter. One can see repeating itself here the history of the development of Checking Theories. Many journal articles are devoted simply to achieving some descriptive goal by splitting some functional element into finer structure. Much the same can be said of OT. There, the content of the constraints themselves is not fixed, nor is the architecture (division into modules) of the linguistic system.
So the number of "Optimality" Theories is enormous and varied, but we are still justified in calling them Optimality Theories if they hew to the basic tenets: the calculus for evaluating candidate structures against a set of constraints, and the notion that all variation reduces to constraint ordering. In like manner, I would reserve the term Representation Theory for any theory that posits multiple syntactic levels in a shape-conserving relation to one another, whatever the levels turn out to be. To that, I would like to add one other substantive hypothesis, the Level Embedding Conjecture of chapter 3, if for no other reason than that I feel that the most interesting predictions follow from the model that incorporates that idea. A number of things can be inferred about this class of theories, things that are independent of various decisions about what the levels are.
The correct RT will have no fewer levels than are envisioned in this chapter. Can we see enough of how the methodology works to gain some rough idea about what the final model might look like? I think the limiting case is an RT with exactly the same number of levels as there are functional elements in the structure of a clause in the corresponding Checking Theory. That is, it would not have a "Case Structure"; rather, it would have an "Accusative Structure" and a "Dative Structure." Likewise, it would not have a Theta Structure; rather, it would have a Patient Structure and an Agent Structure. I think this limiting case is not correct, because there appear to be functional subgroupings of these notions: patient and theme seem to be part of a system with certain properties, as do accusative and dative. But even if this limiting case turned out to be correct, RT would not thereby become a notational variant of Checking Theory, because the architecture is different, and the architecture makes predictions that Checking Theory is intrinsically incapable of. I turn to those predictions in the next chapter.
Chapter 3 Embedding
In the preceding chapters the levels of RT have been used to account for word order facts of a certain sort: mismapping between levels has been invoked as a means of achieving marked word orders with certain interpretive effects. In this chapter I will sketch other properties of the levels and indicate how certain high-level syntactic generalizations might be derived from the architecture of the model in a way that I think is unavailable in other theoretical frameworks. I will consider two kinds of embedding here, complement embedding and functional embedding, and I will treat them very differently. Suppose we accept the notion that there is a fixed hierarchy of functional elements (T, Agr, etc.) that compose clause structure (and similar sets for other phrase types). Functional embedding is then the embedding that takes place within one fixed chain of such elements—embedding AgrO under T, for example. Complement embedding is the embedding that takes place between two such chains—embedding NP or CP under V, for example. In this chapter I suggest that complement embedding takes place at every level, with different complement types entering at different levels. The result is an explanation of the range of clause union effects and a derivation of a generalized version of the Ban on Improper Movement. The methodology is pursued further in chapters 4 and 5, resulting in what I call the LRT correlations: for any syntactic process, three of its properties will inevitably covary, namely, its locality, its reconstructive behavior, and its target (e.g., A or Ā position). These properties are tied together by what level they apply at, and in particular by what complement types are defined there. In chapter 4 I show that anaphors are "indexable" in this way by level, with predictably varying properties across the levels. English himself, for example, is a CS anaphor, whereas Japanese zibun is an SS anaphor; ideally, all properties are determined by those assignments, and earlier anaphors "block" later ones by general principle (the Level Blocking Principle). In chapter 5 I do the same for scrambling rules. The predictions bound up in these correlations rely on the feature of RT that does not translate into minimalism or other theories, namely, the decomposition of clause structure into distinct sublevels or sublanguages. In chapter 7, turning to functional embedding, I propose an axiomatization of X-bar theory that reduces head-to-head movement to X-bar theory, accounting for its locality and especially for its restriction to a single clause structure. In chapter 8 I take up the morphological consequences of this account. In RT a lexical item is understood as "lexicalizing" or "representing" a subsequence of functional structure.

3.1 The Asymmetry of Representation
Before turning to complement embedding, I need to make a point about representation that is entailed by the account I will give. Representation will necessarily be an asymmetric relation in the model that embraces the results of this chapter, for reasons having to do with how embedding is accomplished. By hypothesis, all levels are involved in embedding (the Level Embedding Conjecture; see section 3.2.1). Functional elements are themselves associated with particular levels. Tense, for example, is not defined before SS, and so enters structures there at the earliest. Consequently, there will be representation relations that systematically violate isomorphism. For example:

(1) TS: [agent [V theme]] ↝ CS: [NPnom [VT NPacc]]T

There is at least one element in CS—namely, the T(ense) marking—that is absent from TS in (1); hence, there is not a two-way one-to-one mapping between the two sets of structures. Despite the lack of isomorphism, such relations will count as completely true mappings, not mismappings. The reason is that the representation relation itself will have an asymmetric definition. To take TS ↝ CS as a special case, true representation will have the following properties:

(2) a. Every item in TS maps to an item in CS.
    b. Every significant relation between items in TS maps to a relation in CS (for relations like "head of").

Importantly, (2) does not impose the reverse requirements: that every item in CS be mapped to an item in TS, and so on. If (2) defines representation, then representation is not really isomorphism, but homomorphism, and so is asymmetric. A homomorphism is like an isomorphism in being structure preserving and therefore reversible; but the reverse is not defined for the full range. Representation must be asymmetric if new lexical or functional material enters at each level, as the hypotheses to be entertained in this chapter will require. The Case structure in (1) includes more than the theta structure (T, in particular), but it can still be said to represent the theta structure in (1), if (2) is true. Under this view the mismappings described in chapter 2 are now to be viewed as deviations from homomorphism, rather than from isomorphism. No kind of embedding is immune. Adjuncts will also enter clause structure at later levels, perhaps at all levels. Wh movement itself is not defined until SS, presumably also the level where CP structure is defined (or, where it takes IP), and so any adjuncts that are themselves CPs (such as when, where, and why clauses) involving wh movement cannot enter until that point either. Let us look at a concrete example involving adjuncts. (3) is a fully valid representation relation; the tree on the right obviously has more in it, but that doesn't matter if all the items and relations in the first tree have correspondents in the second.

(3)
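The asymmetry of (2) — every source item and relation must have an image, while the target may contain extra material such as T — is exactly what distinguishes a homomorphism from an isomorphism, and it can be stated as a simple check. The following sketch is my own illustration; the data encoding and function name are invented, not part of the theory:

```python
# Sketch: representation as homomorphism (illustrative encoding).
# A structure is a set of items plus a set of (relation, a, b) triples.
# A mapping "truly represents" the source in the target if (2a) every
# source item has an image and (2b) every source relation is preserved;
# extra target material (e.g. Tense) is ignored, so the relation is
# asymmetric: it need not hold in the reverse direction.

def represents(source, target, mapping):
    # (2a) every item in the source maps to an item in the target
    for item in source["items"]:
        if mapping.get(item) not in target["items"]:
            return False
    # (2b) every significant source relation maps to a target relation
    for rel, a, b in source["relations"]:
        if (rel, mapping[a], mapping[b]) not in target["relations"]:
            return False
    return True

# (1), roughly: TS [agent [V theme]] vs. CS [NPnom [V-T NPacc]]T
ts = {"items": {"agent", "V", "theme"},
      "relations": {("arg_of", "agent", "V"), ("arg_of", "theme", "V")}}
cs = {"items": {"NPnom", "V", "NPacc", "T"},
      "relations": {("arg_of", "NPnom", "V"), ("arg_of", "NPacc", "V"),
                    ("tense_of", "T", "V")}}
m = {"agent": "NPnom", "V": "V", "theme": "NPacc"}

print(represents(ts, cs, m))         # the TS-to-CS direction holds
print(represents(cs, ts,             # the reverse fails: T has no image
                 {v: k for k, v in m.items()}))
```

The failure in the reverse direction is precisely the asymmetry at issue: CS contains T, which has no correspondent in TS, so CS cannot in turn be "represented" by TS under definition (2).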
(4) Preserved relations
    V head of VP
    NP1 subject of VP
    NP2 object of V
    NP1 left of VP

The new item, the adverb, and the new relations it enters into with the rest of the sentence do not interfere with the representation relation. In what follows I will speak of the representation relation as holding sometimes between two levels or sublanguages, sometimes between two members (or trees) of those levels or sublanguages, and even sometimes between subparts of trees in different levels. It is of course the fact that the representation relation preserves the structure of one level in the structure of the next level that makes it possible to slip from one to another of these usages. Wh movement takes place within the SS level, in the following way. A structure in CS is mapped into a very similar structure in SS; wh movement derives another structure within SS; and that structure (at least in languages with overt wh movement) is then mapped to a structure in FS.

(5)
As in previous chapters, the wavy arrow (⇝) marks a representation relation, and now the straight arrow marks an intralevel derivational relation. So the structure has "grown" a SpecC in SS. In effect, the Case structure is mapped isomorphically to a subpart of the surface structure that carries forward (backward?) from there.

Exactly how the functional elements, "real" movement rules, and so on, sort out into levels remains to be fixed empirically. But in advance of that, this chapter lays out a theory that says that all the important properties of the items will in turn be fixed by that choice. Some processes, elements, and such, may be defined at more than one level. For those cases, two of which are anaphors and scrambling rules, the model has further consequences: blocking holds between levels, so "early" elements always block "late" elements (see Williams 1997 for further discussion).

It should be clear that there is a relation between the levels of RT and the layers of functional structure in standard Checking Theories. The asymmetry noted above is fully consistent with this. Later levels of RT correspond to higher layers in functional structure. In particular, later levels have "bigger" structures than earlier levels: I will suggest below that CP exists in SS, for example, but only IP exists in some earlier structure (CS or PS). For some considerations, it will be simple to translate between RT and Checking Theories, because of the "higher equals later" correspondence that holds between them. I will naturally dwell on those considerations for which there appears to be no easy translation from RT to Checking Theory in order to efficiently assess the differences between them.
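The preservation property behind the representation relation can be put algorithmically. The sketch below is my own illustration, not the author's formalism: structures are modeled as sets of items plus named relations (the relation names echo (4) but are otherwise invented), and the asymmetry of homomorphism falls out of a one-directional subset check: the second structure may freely contain more, like the adverb in (3).

```python
# Sketch: a representation relation as a homomorphism check.
# A "structure" is a set of items plus a set of named relations over them.
# B represents A if every item and relation of A has a correspondent in B;
# B may contain extra material (e.g., an adverb) without disturbing this.
# (Illustrative only: item and relation names are invented for the example.)

def represents(a_items, a_rels, b_items, b_rels):
    """True if (b_items, b_rels) preserves everything in (a_items, a_rels)."""
    return a_items <= b_items and a_rels <= b_rels

# A structure corresponding to the left tree in (3), with the relations of (4):
a_items = {"V", "VP", "NP1", "NP2"}
a_rels = {("head", "V", "VP"),
          ("subject", "NP1", "VP"),
          ("object", "NP2", "V"),
          ("left-of", "NP1", "VP")}

# The right tree adds an adverb and its relation; nothing in A is disturbed.
b_items = a_items | {"Adv"}
b_rels = a_rels | {("modifies", "Adv", "VP")}

print(represents(a_items, a_rels, b_items, b_rels))   # True: B represents A
print(represents(b_items, b_rels, a_items, a_rels))   # False: not reversible
```

The one-directional check is what makes representation a homomorphism rather than an isomorphism: the reverse test fails as soon as new material has entered.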
3.2 Complement Embedding and the Level Embedding Conjecture
I will suggest in this section that each of the RT levels defines a different complement type and that all complement types are embeddable. The complement types range from the very "small" clauses at TS to the very "large" clauses at FS. The range of complement types corresponds to the degree of clause union that the embedding involves: TS complements are very tight clause union complements (like serial verb constructions), whereas FS complements are syntactically isolated from the clause they are embedded into. This difference follows immediately from the model itself: RT automatically defines a range of types of embedding complements, one type defined at each level, as summarized in (6).

(6) Types of embedding
    TS objects: serial verb constructions           (VPs?)
    CS objects: exceptional Case marking; control?  (IPs)
    SS objects: transparent that clause embedding   (CPs)
    FS objects: nonbridge verb embedding            (big CPs)
On the right I have indicated the category in standard theory to which the objects defined at each level correspond. This correspondence cannot be taken literally as a statement about what objects are defined in each level of RT, because different RT levels define different types of objects altogether. For example, TS does not define VPs; rather, it defines theta structures, which consist of a predicate and its arguments. Nevertheless, the objects in the RT level of TS correspond most closely to the VPs of standard theory, and so on for the rest of the levels in (6). This aspect of embedding is a ramified "small clause" theory, with small, medium, large, and extra large as available sizes. In a strict sense, the structures "grow" from left to right, theta structures being the smallest and focus structures the largest.

3.2.1 The Level Embedding Conjecture
There are thus many types of embeddable complements under a ramified small clause theory, but where does embedding take place? One way to treat complement embedding in RT would be to do all embedding at TS. Complex theta structures would be mapped forward into complex Case structures, and so on; and higher clause types would then be "recycled" back through TS for complement embedding, as the diagram in (7) indicates.
(7)
This arrangement would make RT most resemble minimalist practice and its antecedents. I think though that much can be gained by a different scheme: the one already alluded to, in which different kinds of embedding are done at different levels. As there seem to be different "degrees" or "types" of embedding with respect to how isolated from one another the matrix and embedded clauses are, we might gain some insight into them by associating the different types with different levels in RT. I will refer to this theory of embedding as the Level Embedding Conjecture (LEC). In RT the LEC is in a way the simplest answer to the question of how embedding is done: it says that an item can be embedded exactly at the level at which it is defined, and no other.

(8)
For example, the tightest clause union effects can be achieved by embedding one theta structure into another in TS, deriving a complex theta structure, which is then mapped into a simple Case structure. The behavior of such embedding is dominated by the fact that there are too many theta roles for the number of Cases, so some kind of sharing or Case shifting must take place. A good example of this is serial verb constructions, where two theta role assigners (i.e., verbs) must typically share a single Case-marked direct object, and where there must be a tight semantic relation between the two.

At the other extreme, that clause embedding takes place much later, in SS for example. What does a derivation involving that clause embedding look like? Two clauses (matrix and embedded) both need to be derived to the level of SS, at which point one is embedded in the other.

(9) TS:                     CS:                   SS:
    [Bill, [believes]]    ⇝ [Bill, [believes]]  ⇝ [Bill [believes]]
    + [Mary, [ate a dog]] ⇝ [Mary [ate a dog]]  ⇝ [Mary [ate a dog]]
                                                → [Bill [believes [Mary [ate a dog]]]]

The verb believe is subcategorized to take an SS complement. This subcategorization is always taken to determine not only the type of the complement, but also the level in which the embedding takes place; it is
this double determination that generates the broad consequences alluded to at the beginning of this chapter, and detailed below.

Before we turn to the details of embedding at different levels, a word about the notion "lexical item" in RT. Lexical items obviously participate in multiple representations. Ordinarily the entries in the lexicon are regarded as triples of phonological, syntactic, and semantic information. In RT lexical items are n-tuples of TS, CS, . . . , and phonological information. For example, the theta role assigner squander, which assigns a theme and an agent role in TS, is related to the Case assigner squander, which assigns accusative Case in CS; to the surface verb squander with its properties, whatever they are; and so on.

(10) squander
     TS: [agent [squander theme]]
     CS: [squander accusative]
     SS: . . .
     . . .
Part of the algorithm that computes isomorphism between levels clearly takes into account identity of lexical items across different levels; thus, (11a) and (11b) will count as isomorphic, but (11c) and (11d) will not.

(11) a. [agent [squander [theme]]]  ⇝  b. [nominative [squander accusative]]
     c. [agent [squander [theme]]] *⇝  d. [nominative [squash accusative]]
Lexical entries such as (10) are the basis for such identities. The rest of this chapter assumes something like this conception of the lexicon, actually just the obvious elaboration of the usual assumption.

3.2.1.1 TS Embedding
As mentioned above, the lowest level of embedding is associated with the strongest clause union effects, since a complex theta structure is represented by a simple Case structure. Consider for example the following serial verb constructions from Dagaare (12a) and ǂHoan (12b):

(12) a. o da mOng la saao de bing bare ko ma
        3sg past stir factive food take put leave give me
        (Bodomo 1998, (32))
     b. ma aqkhu j'o djo ki kx'u na
        1sg prog pour put.in water part pot in
        'I am pouring water into a pot.' (Collins 2001)
In the serial verb construction the clause contains several verbs, each thematically related in some way to at least some of the objects. Significantly, there is a single direct object, and a single indirect object. We can view this as a combination of two theta structures, followed by a subsequent representation by a single Case structure.

(13) TS: {V1 theme} +
         {V2 theme} +
         {V3 theme, goal}
       = {V1 V2 V3 theme goal}   ⇝   CS: [V(Case assigner) NP NP]
In other words, three simple theta structures, one for each V, are combined into a complex theta structure, and that is mapped onto a simple ditransitive Case structure. It is typically remarked in connection with such constructions that the connection between the verbs is extremely tight semantically, so tight that the verbs can only be understood as denoting subparts of a single event. If so, we might suppose that events are defined in TS, hence that complex events are derived there. The "+" in (13), then, is a complex-event-deriving operator with a limited range of possible meanings, and only these are available for serial verb constructions. The possible meanings include 'causes', 'occurs as a part of the same event', and so on.

Such remarks are reminiscent of what is often said about "lexical" causatives: that the notion of causation is extremely direct, causing and caused events constituting a single complex event. For example, (14a,b) are not synonymous.

(14) a. John encoded the information.
     b. John brought it about that the information got encoded.

(14b) holds of a much wider set of situations than (14a). (14a) covers only the case where John performed an action that resulted in the encoding without other mediating events or other agents. In fact, (14b) might tend to exclude the meaning that (14a) has, but this is most likely due to blocking (i.e., for the situations for which (14a) and (14b) are both applicable, (14a) is preferred, because it is more specific than (14b)). As we have hypothesized that morphology has access only to TS, and to nothing higher, it is not surprising that lexical causatives are restricted to the "single complex event" interpretation, since that is the only interpretation available at TS, a fact we know independently from serial verb constructions.
There is a more complex situation that arises in serial verb constructions: each of the verbs has Case-assigning properties. The second verb is sometimes felt to be "preposition-like." These might be analyzed as a complex theta structure mapping into a complex Case structure, where the complex Case structure has two Case assigners, V and P. I will leave the matter for further work.

Other examples of TS embedding might include tight causative constructions. The causative in Romance involves Case shifting (nom → acc, acc → dat) that can be understood as arising from the need to accommodate a complex theta structure in a simple Case frame.

(15) Jean a fait + Pierre manger la pomme →
     Jean a fait manger la pomme à Pierre.
     Jean made  eat    the apple.acc to Pierre.dat
     'Jean made Pierre eat the apple.'

The complex predicate constructions studied in Neeleman 1994 are further potential examples. We could characterize embedding in TS as embedding that shows obvious apparent violations of the Theta Criterion—two or more verbs assign the same theta role to the same NP, without the mediation of PRO or trace. The reason this embedding does not respect the Theta Criterion is that the Theta Criterion itself does not hold in TS; rather, it holds of the way that theta structures are mapped to Case structures.

3.2.1.2 CS Embedding
CS embedding conforms strictly to the Theta Criterion, but may exhibit Case interrelatedness between two clauses. Exceptional Case-marking (ECM) constructions might well be good instances of CS embedding. Case is not really shared between the two clauses in these constructions; rather, the matrix V has Case influence in the embedded clause. With regard to event structure, there is no "single event" interpretation, as the two verbs are part of the designation of different events.

(16) John believes himself to have won the race.
Furthermore, although the embedded clause in (16) is transparent to Case assignment by the verb of the matrix clause, the sentence clearly has two Case assignment domains, and in fact in (16) two accusative Cases have been assigned. Thus, ECM is different from TS embedding.

(17) CS: [John believes] + [himself to have won the race_acc]
       = John believes himself_acc to have won the race_acc
English provides some minimal pairs illustrating the difference between CS and TS embedding. Expletives do not exist in TS, where every relation is a pure theta relation. Expletives exist to fill Case positions that do not have arguments in TS mapped to them. Given this, we might wish to analyze certain small clause constructions as CS embeddings and others as TS embeddings, depending on whether an expletive is involved or not. English has two constructions that might differ in just this way: most small clause constructions require an expletive in the direct object position when the subject of the small clause is itself a clause, but a few do not.

(18) a. I want to make *(it) obvious that Bill was wrong.
     b. I want to make (very) clear that Bill was wrong.

For a handful of adjectives like clear and certain, the verb make does not require an expletive; and as the adverb very in (18b) indicates, the reason is not simply that make-clear is an idiosyncratic compound verb. If we suppose that expletives do not enter until CS, we could assign (18a,b) the following structures, respectively:

(19) a. TS: [make clear]VP   CS: [make clear]V that S
     b. TS: [make]V          CS: [make it clear . . . ]VP
     c. *How clear did he make that he was leaving?
     d. How clear did he make it that he was leaving?

Make-clear is a complex predicate formed in TS, analogous to causative constructions of the kind found in Romance, where, incidentally, expletives are also excluded (Kayne 1975). Expletives then mark "formal" non-TS Case positions, that is, positions with no correspondent in TS. It is likely that "Case" itself is not a single notion; in particular, it is likely that so-called inherent Case is present in TS, and only derivatively in CS. CS then would introduce only formal Cases, not inherent or semantic Cases. Evidence for this would come from compounding: as we have restricted compounding to representing TS, only inherent Case should show up in compounding.
Although I have not investigated the matter in detail, this does conform to my general impression. In the case of make clear, the TS phrase [make clear]VP is mapped to the CS atom [make clear]V. That it is truly atomic can be seen in the contrast between (19c) and (19d): make clear does not allow the extraction of clear, but make it clear does. In previous work (Williams 1998a) I attributed this to the difference between a lexical formation (make clear)
and a phrasal formation (make it clear), along with a principle stipulating the atomicity of lexical units in phrasal syntax. RT allows a relativized notion of atomicity: if a phrase at one level corresponds to, or is (mis)mapped to, an atom at the next level, that atom will be frozen for all processes subsequent to that level. An advantage of this conception is that it does not force us to call make clear a word in the narrow sense, a designation discouraged by its left-headedness and by its modifiability (make very clear). The relativization involved here—relativizing the notion of atomicity to hold between every pair of adjacent levels—will become a familiar notion in chapter 4 and subsequently.

3.2.1.3 SS and FS Embedding
Embedding at SS is ordinary that clause embedding. Case cannot be shared across the that clause boundary (but see Kayne 1981) because Case is already fully assigned by the time the that clause is embedded in its matrix.

(20) CS:              SS:
     I think       ⇝  I think
     + he is sick  ⇝  he is sick
                   =  I think that he is sick

If wh occurs in SS, as I have assumed, then embedding in FS should be out of the reach of wh movement; that is, complements embedded in FS should be absolute islands with respect to FS embedding. What sort of embeddings would be expected in FS? Presumably, embeddings in which it would be reasonable to attribute a focus structure to the complement. Since focus is generally a root "utterance" feature, the embedded clauses that are focus structures would be those that most closely match matrix utterances in their semantics and properties. From this perspective, it would be reasonable to expect "utterance" verbs like exclaimed and yelled to embed focus structures. These verbs embed not just propositions, but "speech acts," loosely speaking, as the verbs qualify the manner of the act itself. This is the class of verbs traditionally identified as nonbridge verbs, so called because their complements resist extraction.
(21) *Who did John exclaim that he had seen t?

To the extent that this is so, then the assignment of this kind of embedding to FS derives the behavior of these verbs with respect to wh extraction.

(22) SS (wh movement):      FS (too late for wh movement):
     [John exclaimed] +  ⇝  John exclaimed [he saw who]
     [he saw who]
In the case of nonbridge verbs, the parts are simply not put together in time for extraction, hence their islandhood. In fact, though, they should not be absolute islands, but islands only to pre-FS movement. If a movement is defined for FS, these verbs should act like bridge verbs for that movement. In order to guarantee that embedding is delayed until FS, the lexical entry for nonbridge verbs must be endowed with subcategorization for FS objects, which is in keeping with their meaning, as mentioned earlier. It is reported that some languages (e.g., Russian) resist wh extraction from all tensed clauses. Perhaps in such a language, all tensed-clause embedding takes place at FS. The derivation of the islandhood of nonbridge verb complements is an example of a kind of explanation natural to RT. I will refer to such explanations as timing explanations.

3.2.1.4 Countercyclic Derivation
The LEC forces some rather unexpected derivations. The matrix may develop a very complex structure itself before the lowest embedded clause is actually embedded into it. For example, consider a sentence in which an ECM infinitive is embedded in a matrix that clause, and another that clause is embedded under the verb in the ECM clause.

(23) a. [that . . . [him to have said [that . . . ]]ECM ]
     b. He believes him to have said that he was leaving.

The LEC actually requires that the ECM construction be embedded in its matrix before the that clause is embedded under the verb in the ECM clause, so for this kind of case the order of embedding is "countercyclic." This is of course because under the LEC, ECM embedding takes place in CS, and that clause embedding takes place in SS, so the derivation looks like this:

(24)
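The timing logic behind this countercyclic order can be made mechanical. The sketch below is only my illustration (the level inventory TS/CS/SS/FS is from the text; the data structures and clause labels are invented for example (23)): embeddings fire in level order rather than bottom-to-top, so the CS-level (ECM) embedding precedes the SS-level (that clause) embedding even though the that clause is structurally deeper.

```python
# Sketch: the Level Embedding Conjecture as a scheduler (illustrative only;
# the clause labels below are invented for example (23)).
# Each pending embedding is tagged with the level at which its complement
# type is defined; embeddings fire level by level, not bottom-to-top.

LEVELS = ["TS", "CS", "SS", "FS"]

# "He believes [him to have said [that he was leaving]]":
# the ECM complement is a CS object, the that clause an SS object.
pending = [
    ("embed the that clause under 'said'", "SS"),
    ("embed the ECM clause under 'believes'", "CS"),
]

# Walk the levels in order, firing each embedding at its defining level.
order = [action for level in LEVELS
         for action, lvl in pending if lvl == level]

for step in order:
    print(step)
# The CS embedding precedes the SS embedding, even though the that clause
# is structurally deeper: the "countercyclic" order of (24).
```

The scheduler makes the contrast with bottom-to-top Merge concrete: order is determined by complement type, not by depth of embedding.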
Similarly, it could happen that a verb taking a that complement is embedded under a matrix raising verb before its own complement clause is added.

(25) TS: . . . [seems + sad]
     SS: seems [sad that Bill is leaving]
The reason for thinking that raising embedding takes place in TS is that it is found in compound formations.

(26) a. sad seeming
     b. odd appearing

We have seen reason to restrict compounds to levels that are representations of TS; but then since raising constructions can appear as compounds, raising must be a TS relation, and so the order of derivation in (25) follows. I do believe that it is entirely harmless that derivations proceed this way. I wish it were more than this; countercyclic embedding is a distinctive feature of RT, so that one should be able to exploit it to find empirical differences with other theories, none of which have this property. Still, I have not been able to find any such differences. It is important to emphasize that the LEC ensures an orderly assemblage of multiclause structure, just as much as the incremental application of Merge in minimalist practice; it simply gives a different order. Embeddings take place in the order of complement type, rather than in bottom-to-top order.

3.2.2 Consequences of the LEC
To sum up the consequences of the LEC, one might say that it forces or suggests generalizations of fundamental elements of linguistic structure: generalized A/Ā distinction, subjecthood, generalized anaphoric binding, generalized scrambling. The dimension of generalization is always across the RT levels. The first two are taken up in the remainder of this section, the last two in chapters 4 and 5.

3.2.2.1 The Relational Nature of Improper Movement
The LEC derives the Ban on Improper Movement (BOIM) directly. In fact, it derives a generalization of it that is distinctive to RT. The BOIM is generally taken to block movement from Ā positions to A positions, as in (27), in which John moves, in its last step, from SpecC of the lower clause to SpecI in the higher clause.
(27) *John seems [t [Bill has seen t]]CP.

I will take it as given that the BOIM is real. I will suggest how it can be generalized in RT, and how it can be derived from the basic architecture of the model in a way that is not possible in standard minimalist practice or its antecedents. The generalization of the BOIM to the Generalized BOIM (GBOIM) is nothing more than the generalization of the A/Ā distinction that we will see in this chapter and in chapters 4 and 5. I will state the GBOIM as it would occur if it were instantiated in a standard model, one with a ramified Pollock/Cinque-style clause structure.

(28) The GBOIM
     Given a Pollock/Cinque-style clausal structure X1 > . . . > Xn (where Xi takes Xi+1P as its complement), a movement operation that spans a matrix and an embedded clause cannot move an element from Xj in the embedded clause to Xi in the matrix, where i > j.

In RT, as we will see shortly, the GBOIM follows from the architecture of the theory and therefore needs no independent statement. The GBOIM is a proper generalization of the BOIM, which it includes as a special case to the extent that A positions are beneath Ā positions in clausal architecture; in general, according to the BOIM, if you are on the second floor of clause A, and you move into clause B, you can't move to a floor any lower than the second. Since we will generalize the A/Ā distinction in RT to the relation between any pair of levels, and since there will be no A/Ā distinction apart from this, the BOIM → GBOIM generalization is forced in the present theoretical context. In this generalized version, items in Case positions in an embedded clause, for example, cannot move into theta positions in the matrix, and so forth. However, items in theta positions can move to higher theta positions, higher Case positions, and so on. The GBOIM is not obviously true, and a number of existing analyses run counter to it, to the extent that it regiments A and Ā positions as special cases.
For example, any analysis in which clitic movement is A movement is contrary to the BOIM, if the subject position is an A position superior to the clitic position. Analyses of this sort must be reexamined in light of the GBOIM. Some are taken up below, though most will remain unaddressed.
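Stated over layer indices, the condition in (28) reduces to a single comparison. The sketch below is my own rendering, not the author's formalism (the layer numbers are illustrative, with smaller indices for higher layers, as in X1 > . . . > Xn): a cross-clausal movement is improper just in case it lands on a matrix layer lower than the embedded layer it left.

```python
# Sketch: the GBOIM of (28) as an index check (illustrative rendering).
# Layers are numbered X1 > X2 > ... > Xn from the top of the clause down,
# so a SMALLER index means a HIGHER layer.

def violates_gboim(embedded_layer, matrix_layer):
    """True if movement from layer j (embedded) to layer i (matrix)
    lands lower than it started, i.e. i > j."""
    return matrix_layer > embedded_layer

# The BOIM as the special case, with illustrative indices:
# SpecC ~ layer 1 (A-bar), SpecI ~ layer 2 (A).
print(violates_gboim(embedded_layer=1, matrix_layer=2))  # True: (27) is out
print(violates_gboim(embedded_layer=2, matrix_layer=1))  # False: proper movement
print(violates_gboim(embedded_layer=2, matrix_layer=2))  # False: same layer
```

On this rendering, movement to the same or a higher layer is always proper, which is what the text asserts for theta-to-Case movement.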
The BOIM itself is not derivable in minimalist practice from the basic principles governing derivation, such as economy or extension (the strict cycle). For example, at the point at which wh movement would violate the BOIM, a minimalist analysis would have built up a structure like (29a), and neither economy nor the strict cycle nor extension prevents the application of wh movement to derive (29b) by putting the wh in SpecV (or SpecI, for that matter).

(29) a. [V [wh . . . ]CP ]V′
     b. [wh [V [t . . . ]CP ]V′ ]V′

This is not to say that there cannot be principles that block particular cases of the BOIM (the GBOIM is in fact such a principle); my limited point is that it does not follow organically from basic assumptions about derivation or economy. But I believe the GBOIM does follow unavoidably from the basic architecture of RT, or something like it, so long as the LEC is a part of it.

The RT levels determine different kinds of embedding, as described in the previous sections. To make the discussion concrete, assume that SS is the level at which "transparent" that clause embedding takes place. Different levels are also associated with different kinds of movement; again, for the sake of concreteness, let's assume that SS is the level at which wh movement takes place and CP structure is introduced. Proper movement takes place in derivations with the following character: first, two surface structures are built up by building up all of the structures smaller than (read, "earlier than") these structures. Then the two surface structures are combined, and finally movement takes place.

(30)
The GBOIM follows from the RT architecture in this way. The earliest that wh movement can take place is after the embedding in SS. However, at that point, not only has the embedded clause been built up to the level of SS, but so has the matrix clause; thus, there is no analogue of (29a) for wh movement to apply to. When wh movement applies in SS, since the surface structure it applies to already has a CP structure, extension (or something like it) requires that it operate in such a way as to move the wh item to the periphery of that surface structure. It will thus always move the wh item to SpecC, since that position is introduced in SS.
For improper movement to take place, the matrix would have to have peripheral positions "lower" than the highest position in the embedded clause. However, that possibility is excluded by the LEC, which says that embedding can take place only among elements of the same type, because each level defines a different type. (31), repeated here from (29a), is therefore not a possible structure in RT with the LEC.

(31) [V [wh . . . ]CP ]V′

The problem in deriving the GBOIM in a theory in which (31) is a well-formed syntactic object is that the matrix and embedded clauses are in different degrees of development. The embedded clause is fully developed to the level CP, but the matrix is only partially developed, so there is no level at which it can embed this CP and thereby derive the improper movement in (29b). Of course, the matrix itself can be developed to the level CP, but then the embedding will occur in SS, and extension, or some equivalent, will force movement to the top of the matrix CP, respecting the BOIM. It is this difference in development of matrix and embedded structures that gives rise to the problem of improper movement. In RT, since embedding is always of objects at the same level, no such difference arises and improper movement is therefore impossible.

RT crucially needs some notion of extension to prevent trivial defeat of the most interesting predictions of the LEC. These trivial defeats correspond to what in the standard model would be violations of the strict cycle if it were applied in a phrase-by-phrase manner, as suggested in Williams 1974. I will assume that extension, essentially as it is used in Chomsky 1995, has to be part of the intended interpretation of RT as well: any operation has to affect material that could not have been affected in a previous level.
The parallelism with the standard interpretation is clear: simply replace level with cycle, where every node is "cyclic." Without something like extension there is no good reason why movement in SS would have to be to the periphery of the CP structure defined there, and not, for example, to SpecIP. In general, extension requires that the periphery be affected by an operation. There are in fact some problems with the literal notion of extension, which I will take up later.

Two immediate empirical consequences of the GBOIM are worth noting here. First, "raising to object position" as a movement rule is impossible, since it is a movement from a higher (subject) position in the embedded clause to a lower (object) position in the matrix clause. If the arguments
(in, e.g., Postal 1974 or Lasnik 1999) for raising to object in ECM constructions are correct, then the analysis involving (improper) movement must now be replaced by an analysis in which mismapping the TS ⇝ CS representation accounts for the facts. Only "real" (intralevel) movement is governed by extension.

The more difficult problem is tough movement. I think the widely accepted misanalysis of tough movement as involving movement to matrix subject position has obstructed progress in syntax at several points in the past 40 years, and so deserves close attention. According to the standard analysis, tough movement actually seems to involve a pair of movements: first, wh movement to SpecC, and second, a (BOIM-violating) movement from SpecC of the lower clause to SpecI of the higher.

(32) John_i is tough t_i to please t_i.

Of course, the difficulty can be solved by simply generating John in the top position in the first place, eliminating the second movement. But that implies that John receives a theta role from tough, and what has always stood in the way of that conclusion is the synonymy of (32) with (33).

(33) It is tough to please John.

Call (32) the object form, and call (33) the event form (because (32) has the "object" John as its subject, and (33) has the event to please John as its subject (extraposed)). The main argument for tough movement, then, is the synonymy of the event form and the object form of these sentences. But this synonymy could be misleading. One component of the synonymy is the perception that selection restraints on John in the two sentences not only are similar, but seem to emanate wholly from the lower predicate (please), and not at all from the higher predicate (tough). But that perception may be illusory. It may be that a class of predicates (easy, tough, etc.) takes such a broad class of arguments, including both events and objects in general, that it is hard to detect selection restraints; in effect, anything can be easy, for example.
In some cases there is an obvious sense in which a thing can be easy.

(34) The test/contest/chore/task/errand/puzzle was easy.

At least for such cases, it must be admitted that easy takes a single nominal argument as subject. For other cases it is less obvious what it means to apply the term easy.
(35) The book/store/bank/rock/tower/dog was easy.

For such cases, though, either the context will determine in what way the thing is easy, or the way it is easy can be specified in an adjunct clause.

(36) The book was easy [to read/write/clean/hide].

But if this view is correct, we are taking the object form to have the following properties: easy takes the object as its (thematic) subject, and the clause after easy is an adjunct. We then must conclude that the tough sentences are at least ambiguous, between this and the usual BOIM-violating derivation; but now perhaps we can eliminate the latter derivation, as redundant. In fact, there is good reason to. First, there are structures just like (36) whose object and event forms are not synonymous, or even equivalent in terms of grammaticality.

(37) a. Mary is pretty to look at.
     b. *It is pretty to look at Mary.

So we know we need structures of the type suggested by (36) anyway. The ungrammaticality of (37b) follows simply from the fact that pretty cannot take an event as an argument, but easy can. Second, there are structures synonymous with (35) that cannot be derived by movement. Consider (38a–f), where (38a) parallels the sentences in (35).

(38) a. John is good.
     b. John is good to talk to.
     c. It is good to talk to John.
     d. John is good for conversation.
     e. John is a good person to talk to t.
     f. *It is a good person to talk to John.

Good acts like a tough predicate in (38a–c), showing the synonymy of object and event forms. However, (38d), though roughly synonymous with (38c), could not conceivably be derived from it. The same is true of (38e), as (38f) shows. So we need to generate the object form directly, with the object getting its primary theta role from the tough predicate, and getting its relation to the embedded predicate only indirectly, as the embedded predicate is an adjunct to the tough predicate.
Embedding
The adjunct status of the embedded clause is further shown by its optionality (see (39a)); in true cases where a matrix subject gets its theta role from an embedded predicate, the embedded predicate is not optional (see (39b)).

(39) a. John is easy.
     b. *John seems.

But so far I have not explained one of the salient facts about the construction that supports the movement relation I am trying to ban: namely, that the matrix subject (e.g., of (36)) is interpreted as the object of the embedded verb. Since in my analysis the matrix subject gets its theta role from the matrix predicate, and the embedded clause is an adjunct clause, it does not immediately follow that the subject will be interpreted as identical to the embedded object. Clearly, some mechanism must interpret the matrix subject as "controlling" the embedded object position, or more precisely, the operator chain in the adjunct clause that includes the object gap. I have nothing to contribute to that topic here; for my purposes it is enough to observe that several diverse constructions require such a mechanism as a part of their description; the pretty to look at construction in (37) is one such case, and (40) illustrates two more.

(40) a. John bought it [to look at t]. (purpose clause)
     b. John is too big [to lift t]. (too/enough complement)

In each of these the embedded operator chain is linked to a matrix argument—object in (40a) and subject in (40b). As there is no chance that movement could establish that link for these cases, I will stick with my conclusion about the tough cases: the matrix subject gets a simple theta role from the tough predicate; the embedded clause is an adjunct with an operator chain, which is interpretively linked to the matrix subject. If this analysis of the tough construction is correct, then a major obstacle to the (G)BOIM is eliminated, and this I think is in fact the most compelling reason to accept that analysis. The LEC rules out more than the (G)BOIM.
It also rules out, for example, any relation between two subject positions if CP structure intervenes. M. Prinzhorn (personal communication) points out that it automatically rules out superraising.

(41) a. *John seems [that t saw Bill].
     b. *John seems [that Bill saw t].
     c. *John seems [that it was seen t].
Not all of (41a–c) count as pure superraising cases in all theories, but in fact they are all ruled out by the LEC: once any CP structure is present in the embedded clause, it is present by hypothesis in the matrix clause, and so, by extension, it is too late to execute any subject-to-subject relations. H.-M. Gärtner (personal communication) provides more cases that are relevant for the GBOIM, and hence for the LEC—namely, the following intriguing examples from German:

(42) a. Wen_i glaubst du [t′_i dass Maria t_i sieht]?
        who believe you that Maria.nom sees
        'Who do you believe that Maria sees?'
     b. Wen_i glaubst du [t′_i sieht Maria t_i]?
     c. Ich frage mich [wen_i du glaubst [t′_i dass Maria t_i sieht]].
        I wonder who you believe that Maria.nom sees
        'I wonder who you believe that Maria sees.'
     d. *Ich frage mich [wen_i du glaubst [t′_i sieht Maria t_i]].
     (H.-M. Gärtner, personal communication)
Schematically:

(43) a. [wh V [t_wh ]Vfinal ]V2
     b. [wh V [t_wh ]V2 ]V2
     c. . . . [wh V [t_wh ]Vfinal ]Vfinal
     d. *. . . [wh V [t_wh ]V2 ]Vfinal

The clear generalization is that it is possible to extract into a V2 (verb-second) clause from either a V2 or a Vfinal (verb-final) clause, but it is possible to extract into a Vfinal clause only from a Vfinal clause. This is a very odd fact. Clearly, V2 clauses are not themselves islands, as (43b) shows; islandhood is determined not just by where the extracted element is coming from, but also by where it is going. This is the sort of fact that barriers were designed for (Chomsky 1982). But I will instead develop a "timing" explanation in terms of the LEC. It will be a little like the account of nonbridge verb embedding: specifically, it will be based on the supposition that V2 clauses are "bigger" (and therefore "later") than Vfinal clauses. The supposition takes some plausibility from the fact that V2 clauses are most often matrix clauses. We might imagine that matrix clauses have more functional structure than embedded clauses—functional structure associated with "speech act" aspects of an utterance (this is the "performative" syntax that harks back to Ross 1970).
(44) [[[ . . . ] . . . ]FVfinal . . . ]F′

F′ here is the extra functional structure that triggers V2; FVfinal structure is strictly smaller. Furthermore, and in fact as a consequence of being "bigger," V2 clauses will be later than Vfinal clauses in RT. For concreteness, I will assume that V2 structures are defined in FS, whereas Vfinal structures are defined in SS, where SS → FS. In this setup wh movement will have to take place at two different levels, since the cases we are looking at have embedded wh and matrix wh. Matrix wh is in FS, and embedded wh is in SS. We might imagine that FS wh is fed by embedded wh; that is, in terms of the structure in (44), wh moves to SpecFVfinal in SS, and from there to SpecF′.

(45) [wh [t [ . . . t . . . ] . . . ]FVfinal . . . ]F′

The second movement might not be a movement, but part of the SS → FS representation. However, I will ignore that possibility here as it plays no role in the explanation of Gärtner's paradigm. As is well known, some German verbs embed V2 complements, which in present terms means that they embed FS clauses at the level FS. If these V2 complements are indirect questions, they will involve FS wh movement to SpecF′, as well as V2, which itself is presumably triggered by F′. So such embedded questions are completely parallel to matrix questions in their syntax and relation to the levels. The diagrams in (43) can now be annotated with the clausal structure postulated in (44), to give the following structures:

(46) a. [wh V [t [ . . . t . . . ]]FVfinal ]F′
     b. [wh V [t [ . . . t . . . ]]F′ ]F′
     c. [wh V [t [ . . . t . . . ]]FVfinal ]FVfinal
     d. *[wh [V [t [ . . . t . . . ]]F′ ]]FVfinal

Only the final movement of the wh in each case is of interest here. Given that F′ > FVfinal in the functional hierarchy, only in (46d) is that final movement a GBOIM-violating "downgrading," from F′ to FVfinal; all the other final movements are either upgradings (46a) or movements that maintain the functional level of the trace (46b,c). Hence, Gärtner's paradigm follows from the GBOIM. I will conclude this section by pointing out a case that is a counterexample to the LEC so long as it relies on the completely literal notion of extension: the French L-tous construction, illustrated here:
(47) a. Marie a toutes_i voulu [les manger t_i].
        Marie has all wanted them to-eat
        'Marie wanted to eat them all.'
     b. Il a tous_i fallu [qu'ils parlent].
        it has all needed that they speak
        'It was necessary that they all speak.'
     c. Il a tous_i fallu [que Louis les lise t_i].
        it has all needed that Louis them read
        'It was necessary that Louis read them all.'

In each of these the tous in the matrix modifies the embedded direct object, suggesting it has been moved from there. The problem, as noted by J.-Y. Pollock (personal communication), is that the tous seems to violate extension under the LEC. Tous is located to the right of the matrix subject, but seems to have been moved out of an embedded clause that is "bigger" (in terms of functional structure) than the phrase to which it has attached. This is especially apparent in cases like (47c): tous has moved out of an embedded that clause, but still has moved to a position short of the subject in the matrix. The LEC with extension would not allow this: if the embedded clause is a CP, then so is the matrix, and extension would dictate no movement except to the edge of that CP. I can imagine two sorts of answer. First, although tous movement can span clauses, the clauses must be infinitival, or, as in (47b,c), subjunctive. Infinitival clauses are smaller (and therefore earlier) than full CPs; perhaps subjunctive clauses are also smaller and earlier, despite the presence of que. The other sort of answer requires a reformulation of extension. I have thus far taken extension quite literally to crucially involve the periphery of the domain. I might instead reformulate it in a more abstract way, as "Movement within a level can only be to positions that are uniquely made available at that level," without requiring that those positions be peripheral in that level. I have no concrete suggestion to make, but the issue will recur in later chapters, as there are other examples of this sort to consider.
3.2.2.2 Subjects

In this chapter I have shown that a generalized ban on improper movement follows from the architecture of RT, and in chapters 4 and 5, I will show how a generalized notion of the A/Ā distinction and reconstruction emerges as well. Similarly, I will suggest in this section that there is a generalized notion of subject in RT, with each level defining its own particular kind of subject: theta subject in TS, perhaps identified as agent; Case subject in CS, perhaps identified with nominative Case; surface subject in SS, perhaps identified with "pure" EPP subjects in languages with nonnominative subjects like Russian (Lavine 2000) and Icelandic. Even FS may involve some notion of subject. In what sense, though, is there a generalized notion of subject? Isn't it simply the case that agents are introduced at TS, nominative Case is introduced at CS, and so on, and that there is no intrinsic link among these elements, as the term subject tends to imply? In fact, the representation relation ties these different notions of subject together: the agent is "canonically" mapped into the nominative NP in CS, which is "canonically" mapped into the "pure" EPP subject position in SS, and so on. I put quotation marks around "canonically," because that concept is exactly what this book tries to explicate in terms of the notion of shape conservation. So RT offers a natural account of the notion that subjects are agents, nominative, and topicalized: this results from the purely canonical mapping across all the relevant levels, but it also permits deviation from canonicity, of the type shown in chapter 2. In what follows I will try to sort out some of the wealth of what is now known about subjects into properties of different levels. I cannot pretend to offer anything more than suggestions at this point. I do think that RT gives voice to the old intuition that there are several different notions of subject that get wrapped up into one; at the same time it seems to offer the possibility to derive the properties of the different notions from what is already known about the structure of each level and how it is represented in the next.

3.2.2.2.1 Quirky Subjects

For languages like Icelandic at least, it is obvious that there is a notion of subject more "superficial" than Case assignment.
I will tentatively identify the level at which this more superficial notion of subject applies as SS, though in the next section I will revise this guess to a level intermediate between SS and CS. As detailed, for example, in Andrews 1982 and Yip, Maling, and Jackendoff 1987, Icelandic has a class of verbs that take subjects that are not nominative, but are instead "quirkily" Case marked with dative, accusative, or genitive.

(48) Drengina vantar mat.
     the-boys.acc lacks food.acc
     (Andrews 1982, 462)
In the appropriate circumstances nominative Case can show up on the direct object when the subject receives a quirky Case.

(49) Mér sýndist álfur.
     me.dat thought-saw elf.nom
     'I thought I saw an elf.'
     (Andrews 1982, 462)

Andrews presents clear evidence that the dative and accusative NPs in these two examples are subjects in the obvious senses. First, quirkily Case-marked NPs can undergo raising, and the quirky Case is preserved under that operation.

(50) Hana virðist vanta peninga.
     her.acc seems to-lack money.acc
     (Andrews 1982, 464)

The verb vanta assigns quirky accusative Case to its subject, and (50) shows that raising preserves the Case. It is only in the case of quirky Case assignment that a raised subject can be Case marked anything but nominative. Second, quirky subject Case marking shows up in Icelandic ECM constructions.

(51) Hann telur barninu (í barnaskap sínum) hafa batnað veikin.
     he believes the-child.dat (in his foolishness) to-have recovered-from the-disease.nom
     (Andrews 1982, 464)

Third, quirkily Case-marked subjects are "controllable" subjects.

(52) Ég_i vonast til að PRO_i vanta ekki efni í ritgerðina.
     I hope to to lack not material for the-thesis
     (Andrews 1982, 465)

As mentioned before, vanta assigns accusative Case to its subject, and as (52) shows, that accusative NP is silent, but understood as coreferential with the nominative matrix NP. Andrews emphasizes that other preverbal NPs, such as topicalized NPs, cannot participate as the pivot NP in an ECM, control, or raising construction. So the quirkily Case-marked subjects really are subjects in a substantive sense. Clearly, the subject in these sentences is at some point within the Case-assigning reach of the verb. I will assume that these Cases are assigned in CS, in the following sorts of structures:
(53) a. CS: [NPnom [V NPacc ]]
     b. CS: [NPdat [V NPnom ]]

Suppose that SS generates structures like the following:

(54) SS: [NPA [V NPB ]]

We could regard structures like (54) as Case free, or Case indifferent, leading to slightly different theories. I will arbitrarily pursue the idea that such structures are Case indifferent. Surface structures are Case indifferent in that A and B in (54) can bear any Case insofar as the well-formedness conditions of SS are concerned; what Cases they turn out to bear in a particular sentence will be determined by what Case structures they are matched up with. The natural shape-conserving isomorphism will identify NPnom with NPA, and NPacc with NPB. It is natural to identify NPA in SS as a "subject" and to inquire about its properties. The notion of subject in CS is obvious: the most externally assigned Case in CS. I will not go into how structures like (53) are generated, but see Harley 1995 for suggestions compatible with proposals made here (see especially the Mechanical Case Rule). Quirky Case marking splits the subject properties in two, a split that corresponds to the two levels CS and SS in RT: specifically, quirky subjects are Case marked (CS), nonagreeing (CS), raisable (SS), and controllable (SS). The controllable subject will be the SS subject (to be revised shortly, when a further level is interposed between CS and SS) regardless of the Case of the NP in CS that is matched to the SS subject. Quirky subjects, on the other hand, do not act like nominative subjects in regard to agreement—quirky subjects do not agree.

(55) Verkjanna er talið ekki gæta.
     the-pains.gen is believed not to-be-noticeable
     (Andrews 1982, 468)

Agreement is presumably then a property determined in CS. This arrangement—Case-marked subject and agreement in CS, controllable subject in SS, with representational mapping connecting the two—gives the two notions of subject needed to interact with other phenomena in grammar. CS looks inward, and SS outward.
3.2.2.2.2 EPP Subjects, Raising, and Control

In chapter 2 we saw that Russian also has a notion "subject" that is "beyond Case." In certain circumstances a clause-initial position must be filled, a requirement that
can be evaded only to achieve a special focus effect. Furthermore, the trigger for this movement is not Case, as the NP (or other phrase moved to clause-initial position) already has its own Case, which it brings with it. In a ramified Pollock-style model, such examples can be understood as instances of "pure" EPP, a movement motivated apart from any Case requirement. They are also beyond any requirement of agreement. They are therefore beyond CS, like the Icelandic examples. But in fact, they differ from the Icelandic examples in an important way: the pure EPP position in Russian is also not a controllable position. The Russian verb tošnit' 'feel nauseous', like the verb založilo 'clogged' discussed in chapter 2, takes no subject argument; but it does, again like založilo, take an internal accusative object that must be fronted in "neutral" circumstances. I have chosen this verb because it has an animate argument and so could potentially participate in control structures. But in fact that NP argument cannot be controlled.

(56) a. Džona tošnilo.
        John.acc felt-nauseous.neut
     b. Menja prodolžalo tošnit'.
        me.acc continued to-feel-nauseous
        (Babby 1998a)
     c. *Ja xoču tošnit'.
        I want to-feel-nauseous
     d. Ja xoču, čtoby menja tošnilo.
        I want so-that me.acc feel-nauseous
        (E. Chernishenko, personal communication)
(56a) illustrates the use of the verb in a tensed clause. (56b) shows that the verb is compatible with aspectual predicates. (56c) illustrates the ungrammatical situation where the accusative NP is controlled as the subject of an embedded infinitive. (56d) shows how a Russian speaker would say what (56c) intends to say—using a subjunctive clause with an overt accusative argument, clearly not a control structure. In the view put forward here, (56a), (56b), and the embedded clause of (56d) all have subjectless TSs, which are mapped, at least in the case of (56a) and (56b), to subjectful surface structures, but too late for control; at the relevant level for determining control, they still have no subject. Assuming that control is established at CS (this will be amended shortly), (56a) and (56c) are derived as follows:
(57) a. TS: [tošnilo Džona] → CS: [tošnilo Džona] → SS: [Džona tošnilo]
     b. TS: ja xoču [tošnit' PRO] → CS: ja xoču [tošnit' PRO] → SS: ja xoču [PRO tošnit']

The infinitive in (57b) does not have a PRO subject until SS, too late for control in CS. I have implemented control in terms of PRO, but that is not essential to the point. What is essential is that at the relevant level, and in the relevant sense, tošnit' does not have a subject. So Russian diverges from Icelandic on this point (cf. Icelandic (52)). In order to assess this difference between Russian and Icelandic, we must fix the level at which control is established. This question can be approached in both RT and the standard ramified Pollock/Cinque-style clause structure. In a theory with such a clause structure, we would conclude that there was a further level of functional structure that could be used to sort out the different notions of subject, as shown in (58).

(58) [tree diagram not reproduced in this extraction]
This array of conclusions can be modeled in RT by the following subsequence of the representational chain:

(59) Case-Agr Structure → Control Structure → Russian-EPP Structure

Each representation would have a subject position, which would be mapped or mismapped from a previous level. Control Structure would have mapped into its subject position the highest Case position in CS; the objects defined in Control Structure would be the ones selected by raising and control predicates; and Russian-EPP Structure would have a notion of subject more abstract than (in other words, not limited to) Control Structure. The equivalence of (58) and (59) should be familiar by now, which is of course not to say that the theories in which they arise are equivalent. In both theories certain results must obtain to achieve empirical adequacy: in English all three notions must collapse into one; in Icelandic control subjects must be distinct from Case-Agr subjects; and in Russian all three
notions must be distinct. The two models will achieve these results in different ways. The question for RT is how to graft the subchain in (59) into the model presented in chapter 2. This question could be definitively answered by identifying the ends of (59) with elements of the chapter 2 sequence. A plausible candidate of course is that Case-Agr Structure is CS and that Russian-EPP Structure is SS; but then Control Structure will intervene between CS and SS as a new level. In fact, there is good reason to posit a level between CS and SS. The reasoning is simple: there is a notion of subject that is more abstract or more general than "most externally assigned Case" but narrower than "topicalized subject." Control and raising seem to require some intermediate notion of subject. In chapter 4 we will see that anaphoric control requires a further notion of subject as well. The question then emerges, do all these phenomena converge on a single notion of intermediate subject? One consideration is the bounding of anaphors. Earlier I identified the English anaphor as a CS anaphor. One reason for positing a level earlier than SS is that CP structure is defined in SS, and elements in SpecC do not seem to be able to antecede English reflexives, as shown earlier (this is simply the well-known generalization that English reflexives must be A-bound). Himself is thus bound by some earlier notion of subject; the question is, is it the CS subject? For English it is difficult to say, but for the Icelandic reflexive sig the answer is no. We also know from Icelandic that the control subject is not the agreement subject. For one thing, Icelandic allows control of NPs that would not be assigned nominative Case. Moreover, when nominative Case is assigned to the object and the verb agrees with it, control nevertheless targets a different NP, the "subject" in some higher (or later) sense.
This later subject then is not the agreement subject, which we might take to be the CS "subject." But neither is it an SS subject, in that it is restricted to A antecedents. This Icelandic anaphor, as well as the English himself, is thus likely to be an element introduced in a level intermediate between CS and SS, a level I will now identify with the label Predicate Structure (PS). We have thus identified (58) as the subsequence CS → PS → SS, so that the model now looks like this:

(60) TS → CS → PS → SS → FS → PS b QS
Assigning himself to PS in English is slightly arbitrary, since it could as easily be assigned to CS; only Icelandic shows evidence of the slightly more abstract notion of subject. But this assignment does allow the immediate annexation of the findings reported in Williams 1980, where the licensing of anaphor binding in English was identified with the notion "predicate," rather than "subject"; in the present context we could return to the notion "subject," but only if we mean precisely the PS subject. Another phenomenon that might be accounted for in terms of the properties of PS is VP deletion. PS will define a notion of one-place predicate, corresponding to some version of the (traditional) English VP, which is abstracted away from whatever subject it is applied to; this abstracted VP is what is needed to account for so-called sloppy identity.

(61) John likes himself and Sam does too.

What does Sam do? In the sloppy reading he does not "like John"; rather, he "self-likes," just as John does. There is some controversy whether this is the right view. I will return to the matter in chapter 9, where I fill in some idea of what the semantic values assigned to the objects in each level are. Control and raising themselves must be assigned to some representation earlier than SS, if SS is where CP structure is introduced. Essentially, this follows from the logic of the GBOIM in RT, even though it is not usually considered a case of improper movement. Control and raising are NP Structure rules, in the terminology of Van Riemsdijk and Williams (1981), which entails that they are always relations between pairs of A positions. But by the LEC, they must then be defined in a level that has only A positions; this excludes SS, if SS is the level in which CP structures are introduced. In other words, the following will always be ungrammatical structures:

(62) a. *John seems [ . . . to have won]CP.
     b. *John tried [ . . . to have won]CP.
These violate the GBOIM in RT, though not in the familiar application of the term improper movement, as noted earlier. Since we already know from Icelandic that control is defined in a more abstract level than CS, we are left with the conclusion that control is bounded by CS on one side and SS on the other—and so we are left with the conclusion that control is defined at PS as well.
The conclusion about (62a) was established independently in Williams 1994b, where it is argued that CP structure inhibits the transmission of the theta role to the matrix subject. (62b) is a case of obligatory control, in the sense of Williams 1980, where it is demonstrated that there are no cases of obligatory control over CP structure; that is, control of x by John in examples like the following is always an instance of optional or "arbitrary" control:

(63) a. John wonders [who [x to talk to]]CP.
     b. [Who [x to talk to]]CP was not known to John.

See Williams 1980 for further discussion, and also see Wurmbrand 1998 for a comprehensive account of the difference between obligatory and optional control exercised across a variety of European languages that delivers exactly this conclusion. But see also Landau 1999, where it is argued that the obligatory/optional distinction is specious. It follows as well that PRO in CP cannot be controllable; that is, derivations like (64) are impossible.

(64) *John_i wants [PRO_i [ — to talk to t_i]]CP.

This again follows if control is defined at PS. But alongside (64) we do find (65a,b).

(65) a. John bought it [OP_i [ — to read t_i]]CP.
     b. A shelf arrived [OP_i [ — to put books on t_i]]CP.

(65b) appears to involve a control relation between the direct object and the SpecCP of the clause [OP [to put books on t]]. Why is that relation allowed, if control is consigned to PS? The crucial difference between (64) and (65a,b) must be that the clauses in (65) are adjunct clauses. The rules determining the form and meaning of adjunct clauses are patently not confined to PS, as in general wh movement can be involved in the formation of adjuncts (e.g., relative clauses). The question remains, are there any principled grounds for separating "real" control from control of wh-moved operators in adjunct structures? I will postpone this question until it is appropriate to discuss in general when adjuncts are embedded.
For the time being we may satisfy ourselves with the idea that "argumental" control is established at PS. Part of the benefit of the LEC can be achieved in a theory with standard clausal architecture by allowing the embedding of structures smaller than CP—that is, "small clauses." Locality effects and limitations on the
target of rules can be achieved in this way: embedding structures smaller than CP will give a weaker clause boundary (thus allowing local rules to apply in such a way as to bridge the clause boundary), and omitting CP will at the same time provide a narrower class of targets (the Ā target SpecC will be excluded, for example). This was the strategy adopted in Williams 1974, where I argued that certain clause types lack CP structure, having only IP or smaller structure (hence, "small clauses") (though this terminology did not exist at the time—CP(90s) = S′(70s); IP(90s) = S(70s)). For example, there are no gerunds with a wh complementizer system, so gerunds cannot be used to form indirect questions.

(66) *I wondered [whose book Bill's having seen t].

What the LEC in RT adds to the small clause theory is that "smaller" corresponds to "earlier," and this draws in the further property of rules connected with reconstructivity—that is, the details about what movement rules reconstruct for what relations. It also draws in the notion of target type (A vs. Ā), if each RT level defines different types of NPs. Small clause theories have no means of connecting locality with these notions of target type and reconstructivity in a theoretically organic way. I will discuss the full set of locality-reconstructivity-target correlations (LRT correlations) in chapter 4. But for the moment I restrict attention to the correlation between target and locality. Wurmbrand (1998) has pursued the small clause methodology for German restructuring verbs; she argues that they lack CP and IP structure, having only something like VP structure, and proposes that their clause-union-like properties result from the smaller structure. This sort of analysis is quite similar to the proposal I am making, in that smaller clause types result in more clause union effects, and it thus explains locality-target correlations—penetrable complements are ones that lack Ā targets.
Cinque (2001) has taken a different but related tack. He has argued that restructuring verbs actually are themselves functional elements. Suppose that clausal functional structure = F1 > F2 > . . . > Fn. Normally, a main verb takes a complement by instantiating Fn, and taking an F1P as its complement. But Cinque suggests that a restructuring verb is an Fi, and that it takes the rest of the functional chain, Fi+1 > . . . > Fn, as its complement, just as an abstract Fi would. At first glance this would appear to give the same results as the small clause approach: the restructuring verbs will take smaller complements
than normal verbs, in that a restructuring verb identified as an Fi will take as its complement only the tail of the clausal functional chain starting at Fi+1 and so will in effect take a small clause as its complement. Clause union effects will derive from the fact that the restructuring verb and its complement compose a single clausal functional chain. On the last point, though, Cinque's proposal is quite different from the small clause embedding proposal, the RT proposal (with the LEC), and Wurmbrand's proposal. In these accounts a small clause complement is a separate (if degenerate) subchain from the chain that terminates in the restructuring verb, not a continuation of that subchain. The difference is radical enough that it should be easy to devise decisive tests, though I will not try to do so here. On one count, though, the evidence is very simple, at least in principle. Cinque argues for his proposal in part by pointing out that adverbs that cannot be repeated in a single clause also cannot be repeated in a restructuring verb structure. This of course does not follow at all from a theory in which there is an actual operation of clause reduction. It does follow from Cinque's proposal if we accept Cinque's (1998) central idea about the distribution of adverbs: namely, that adverb types are in a one-to-one relation to clausal functional structure, and that the nonrepeatability of adverbs follows from the absence of subcycles in the clausal functional structure. Naturally, this nonrepeatability will carry over to restructuring structures, if the verb and its complement instantiate a single clausal functional structure. The prediction is somewhat different in a small clause theory of the restructuring predicates. The difference between the two theories is schematized in (67) (RV = restructuring verb; MV = main verb).

(67) a. Cinque-style theory
        F1 > F2 > F3 > F4 > F5 > F6 > F7
                       RV             MV
     b. Small clause theory
        F1 > F2 > F3 > F4 > F5 > F6 > F7 > F5 > F6 > F7
                                       RV             MV

In the Cinque-style structure there is one clausal architecture, F1 . . . F7; in the small clause structure the restructuring verb itself is an F7 and takes the small clause F5 > F6 > F7 as its complement. The theories coincide in predicting that adverbs associated with "high" adverbs at F1 . . . F4 cannot be repeated, if we make Cinque's assumption
about the relation of adverb type to functional structure, simply because these functional projections occur only once in each structure. But with respect to "low" adverbs, ones associated with F5 . . . F7, the theories diverge. The Cinque-style structure predicts that they will not be repeatable. The small clause theory predicts that they will be repeatable—once modifying the restructuring verb, and once modifying the main verb. The small clause analysis seems to be borne out in the following example:

(68) John quickly made Bill quickly leave.

The manner adverb quickly can be seen to modify both the restructuring verb and the main (embedded) verb, and thus the structure (67b) appears to be the correct one. This at least establishes that the small clause analysis is correct for make in English; I have not obtained the facts about Romance restructuring verbs to determine whether they behave as make does in (68). In this section I have taken up some new empirical domains (control, raising, predication, VP deletion, and their interaction with Case) and posited a further level in RT to treat the complex of phenomena that arise when they interact. I cannot blame the reader who at this point is distressed by the proliferation of levels in RT. But I do think that some perspective is required in evaluating the practice involved. Much of the proliferation of levels corresponds, point by point, with proliferation in a ramified Pollock/Cinque-style theory (RP/CT), in that there is at the limit (the worst case) a one-to-one correspondence between levels in RT and functional elements in RP/CT. As I remarked earlier, the worst case deflates my theory, because in this case the parallelism induced by Shape Conservation is trivialized.
But for the moment I would focus on the fact that RP/CT lives with the following more or less permanent mystery: there is a fixed universal set of functional elements in some fixed order that defines clause structure, each with its own properties and its own dimensions of linguistic variation. Now, this mystery corresponds exactly to the ramified levels of RT—to the extent that often a revision in the understanding of the role of a functional element in RP/CT will translate straightforwardly into a revision in the understanding of a level in RT. The fact that functional elements are called lexical items in RP/CT and levels in RT should not be allowed to obscure this correspondence. I think the correspondence puts into perspective the methodology that RT naturally gives rise to: solve problems by figuring out what
levels are involved in the phenomena and fix the details of those levels accordingly—in the worst case, standard practice. 3.2.2.2.3 Subject Case and Agreement In this last subsection I will speculate on how an insight of Yip, Maling, and Jackendo¤ (1987) could be expressed in RT. There is a di¤erence between English, on the one hand, and both ergative languages and languages like Icelandic, on the other hand, which has eluded the model so far. In English, the subject, if Case marked, is always Case marked in a way that is independent of the verb it is subject of, and in particular, independent of what Cases are assigned in the VP. But in the other languages mentioned, subject Case marking is dependent on the Case structure of the VP in ways noted earlier. Yip, Maling, and Jackendo¤ suggest that the subject falls within the Case domain of the verb in Icelandic-type languages, whereas in English the subject is in a separate domain; in Icelandic, in their view, there is only one Case domain, whereas in English the clause is divided into two Case domains. A further corollary of this view is that nonsubject nominatives will be found only in Icelandic-type languages. In the present theoretical context we might adapt Yip, Maling, and Jackendo¤ ’s conclusions by treating English nominative as a Case assigned at PS instead of CS. If it is the only Case assigned in PS, and if it is always assigned to the subject defined at that level, then there will be no opportunity for it to mix with the rest of the Case system, which is assigned at CS. Under this arrangement we no longer have a ‘‘single-level’’ Case theory. But perhaps it is arbitrary to expect that in the first place. This arrangement makes an interesting prediction about expletives. The simplest account of expletives is to treat them as ‘‘formal’’ Case holders; that is, they occupy a Case position in CS that does not correspond to a theta position in TS. 
But in fact, we might consider confining expletives to PS; in that case we would expect (subject) expletives only in languages like English, which has an ‘‘absolute’’ nominative subject requirement. I do not know if the facts will bear out this conclusion. But German is clearly a language of the Icelandic type with regard to Case assignment. (69) Mir ist geholfen. I.dat is helped ‘I was helped.’ Since the dative subject in (69) is a controllable nonnominative subject, the remarks about Icelandic apply here. Moreover, German does not seem to have a subject expletive.
(70) a. Es wurde getanzt.
it was danced
‘There was dancing.’
b. Gestern wurde (*es) getanzt.
yesterday was it danced
c. Ich glaube dass (*es) getanzt wurde.
I believe that it was danced
The expletive es appears only in matrix clauses, presumably because it is not a subject expletive, but a fill-in for Topic position; therefore, because of the well-known matrix/subordinate difference in German clausal syntax—topicalization and V-to-C movement apply only in the matrix—it will play a role only in matrix clauses. So, even the notion ‘‘expletive’’ needs to be generalized across the RT levels. The lesson from this section will become familiar: a previously unitary concept is generalized across the levels of RT. In this case it is the notion ‘‘subject,’’ difficult to define, but now decomposed into components: agreement subject, control subject, thematic subject, Case subject, pure EPP subject, and so on. But the decomposition brings more than its parts, because these notions are ordered with respect to one another by the asymmetric representation relation. The ordering allows us to say, for example, that Icelandic quirkily Case-marked subjects are ‘‘earlier’’ than Russian pure EPP subjects and therefore liable to control.
Chapter 4 Anaphora
The overall typology of anaphoric elements can be reinterpreted in terms of the different levels of RT. Associating different anaphors with different levels interacts with the LEC to fix properties of anaphoric items in a way that I think is unique. In a sense it is a generalization of the method used to explain the BOIM in chapter 3. The same method will be applied more broadly still in chapters 5 and 6. The Level Blocking Principle introduced in chapter 3 will play an important role in the discussion as well. According to this principle, if one and the same operation can take place in two different levels, the application in the early level blocks the application in the later level. If anaphors are introduced at every level, the applicability of such a principle will be obvious.
4.1 The Variable Locality of Anaphors
It emerged in the 1980s, beginning with Koster 1985, that there is a hierarchy of anaphoric elements, from ones that must find their antecedents at very close range, to those whose antecedents can be very far away. It will be natural to associate these with the levels of RT in such a way that the more long-distance types are assigned to later structures, with the hope that the ranges of the different types can be made to follow from the ‘‘sizes’’ of the objects defined at each level. In this sense RT levels index the set of anaphors in the same way that they index embedding types as shown in chapter 3. Here and in chapter 5 we will see that RT, with the LEC, draws together three different properties of syntactic relations: their locality, their reconstructivity, and their target (where target refers to choice of A or Ā antecedent, generalized in a way to be
suggested in chapter 5). I will refer to the correlations among these three different aspects of syntactic relation as the LRT correlations (locality-reconstructivity-target). Although different aspects of this three-way correlation have been identified in previous work, it seems to me that the whole of it has not been drawn together theoretically, nor has the scope of the generalization involved been well delineated. I believe it is a distinctive feature of RT that it forces a very strong generalized version of the correlation. For example, RT makes explicit the following correlation about how locality and type of possible antecedent covary. Traditionally, it has been assumed that an anaphor must have an A position antecedent. For example, it has been held that a wh-movement-derived antecedent is not available for English reflexives (except of course under reconstruction). Thus:
(1) a. *John wondered [which man]i pictures of himselfi convinced Mary that she should investigate t.
b. John wondered which mani Bill thought [ti would like himself].
In (1b) the reflexive is bound to which man, but via its A position trace, which c-commands and is local to it. In (1a), however, this is impossible; the trace of which man does not c-command the anaphor, and (most importantly) which man is in an Ā position and so is ineligible itself as antecedent. But in RT, the notion of A position is relativized. Each representation relation gives rise to a unique A/Ā distinction: positions at level Ri are A positions with respect to positions at level Ri+1. As a result, we might expect anaphors at different levels to behave differently; specifically, we might expect anaphors at later levels to have an apparently ‘‘expanded’’ notion of potential antecedent (target).
Furthermore, as we move ‘‘rightward’’ in the RT model, this expanding class of antecedents should be correlated with loosening locality restrictions, simply because the structures get ‘‘bigger.’’ Thus, the discussion in this chapter helps substantiate the LRT correlations made possible by the LEC in chapter 3, and first put to analytic use there to generalize and explain the BOIM. The correlations are purely empirical, and not necessary (apart from the theory that predicts them, of course). Consider, for example, Japanese zibun and Korean caki. As is well known, these anaphors are not bounded by subjects as English himself is, or in fact by any sort of clause type; nor are they bounded by Subjacency.
(2) Johni-i Billj-ekey Maryk-ka cakii/j/k-lul cohahanta-ko malhayssta.
John-nom Bill-dat Mary-nom self-acc like-compl told
‘Johni told Billj that Maryk likes selfi/j/k.’ (Korean; Gill 2001, 1)
As a consequence, in RT caki must be an anaphor that is introduced in a late level, perhaps SS or FS, the levels at which tensed-clause embedding takes place. As an SS (or FS) anaphor, it will take as its antecedents the elements that are developed at SS (or FS), among them the Topic and the Focus of the utterance. So, caki should be able to be bound by a class of antecedents not available for the English reflexive, namely, Ā antecedents; and this prediction seems to be borne out.
(3) Johni-un ttal-i cakii-pota ki-ka te kuta.
John-top daughter-nom self-than height-nom more is-tall
‘As for Johni, (his) daughter is taller than selfi.’ (Korean; Gill 2001, 1)
In this structure caki is bound from the derived Ā Topic position. This is possible because caki is licensed at SS, where such elements as Topic are introduced. Similar facts hold for zibun in Japanese and ziji in Chinese. Such licensing is impossible in English, as English reflexives are licensed in CS (or at least, before SS), and Topics don’t exist in that level.
(4) *(As for) [JohnT]i . . . the book for himselfi to read was given to t by Bill.
In RT this property of the English reflexive is not a free parameter, but is determined by another difference between zibun and himself. Namely, subject opacity holds for himself, but not zibun, because in the RT model each of these properties is determined by what level the reflexive is introduced in, so only certain combinations of properties are possible. In addition to zibun, Japanese has another reflexive, zibunzisin, which is essentially like English reflexive himself, both in locality and in type of antecedent (A/Ā).
(5) Johni-wa [[Billj-ga Maryk-ni zibunzisini/j/k-o subete sasageta] to] omotta.
John-top Bill-nom Mary-dat himself-acc all devote that thought
‘John thought that Bill devoted all (of) himself to Mary.’
Latin also shows a correlation between distance and type of antecedent. According to facts and analysis provided in Benedicto 1991, the Latin se anaphor has both a greater scope and a greater class of possible antecedents than standard anaphors. First, reflexive binding of se (dative sibi here) can penetrate finite clause boundaries.
(6) Ciceroi effecerat [ut Quintus Curius consilia Catalina sibii proderet].
Cicero.nom had-achieved comp Quintus Curius.nom designs.acc Catalina.gen refl.dat reveal.subj
‘Cicero had induced Quintus Curius to reveal Cataline’s designs to him.’ (Sall., Cat., 26.3; from Benedicto 1991, (1))
In fact, it can even penetrate into finite relative clauses.
(7) Epaminondasi [ei [qui sibii ex lege praetor successerat]] exercitum non tradidit.
Epaminondas.nom him.dat that.nom refl.dat by law.abl praetor.nom succeeded.ind army.acc not transferred
‘Epaminondas did not transfer the army to the one who succeeded him as a praetor according to the law.’
This is especially noteworthy since it casts doubt on treating long-distance reflexivization in terms of movement, as several accounts have proposed. Given this, we would expect the reflexive to occur in late levels. From that it would follow that it could target Ā antecedents. Citing the following examples, Benedicto argues that this is exactly the case:
(8) Canumi tam fida custodia quid significat aliud nisi [sei ad hominem commoditates esse generatos]?
dogs.gen such trusty watchfulness.nom what mean else except refl.acc for men.gen comfort.acc be created.inf
‘The trusty watchfulness of the dogs, . . . what else does it mean, except that they were created for human comfort?’ (Cic., Nat. deor., 2.158; from Benedicto 1991, (24))
(9) A Caesarei ualde liberaliter inuitor [sibii ut sim legatus].
by Caesar.abl very generously am-invited refl.dat comp be.subj legate.nom
‘Caesar most liberally invites me to take a place on his personal staff.’ (Cic., Att., 2.18.3; from Benedicto 1991, (25))
(10) A Curione mihi nuntiatum est [eum ad me uenire].
by Curio.abl me.dat announced was he.acc to me.acc come.inf
‘It was announced to me by Curio that he was coming to me.’ (Benedicto 1991, (33))
Benedicto makes the point that normally passive by phrases cannot control reflexives in general. The fact that the by phrase in (10) is the antecedent of the reflexive suggests that it can be so solely by virtue of its role as a topicalized NP, which of course is consistent as well with its surface position. In RT terms this means that the reflexives in these examples are directly bound by the Topic position, not bound to the trace of the Topic position.
(11)
Anticipating the discussion of Reinhart and Reuland’s (1993) theory of anaphora in section 4.3, I will note that it seems unlikely that Benedicto’s conclusions can be rewritten in terms of logophoricity (unless logophoric is redefined to correspond to topic-anteceded). Important to evaluating RT in this connection is that in the absence of a theory there is no logical connection between locality and type of antecedent, in either direction. Thus, the locality of himself does not predict the ungrammaticality of (1a), as no subject interrupts the anaphor-antecedent relation. In the other direction, there is nothing about the lack of locality of zibun that directly predicts that it could be bound by Ā antecedents. One can easily imagine a language in which, for example, (2) is grammatical but (3) is not—all one would need is the ability to independently specify the locality and the antecedent class for a given anaphor. RT does not allow this, as both properties of an anaphor derive from the particular level the anaphor is assigned to. Assigning an anaphor to a level simultaneously determines its locality (its relation to its antecedent will be restricted to spanning the objects that are manufactured at that level) and its antecedent class (it will take as antecedents the elements that appear in structures at that level).
And it does so in a completely generalized ‘‘graded’’ or indexed way: the larger the locality domain, the wider the class of antecedents. In this regard RT is more generous than other theories with only the A/Ā distinction. But that generosity is apparently needed, and it is compensated by the locality-type correlation. In section 4.3 I will suggest that the flaws in Reinhart and Reuland’s (1993) theory
stem mainly from its having only a binary distinction for types of anaphors (their ‘‘reflexive predicate/logophoric pronoun’’ distinction) instead of the indexed notion suggested here. In advance of looking at any data, we can in fact sketch what we would expect to be the properties of anaphors at different levels in RT. These are all consequences of the LRT correlations, which in turn follow from the architecture of the model. If there is an anaphor associated with TS, for example, it will relate coarguments of a single predicate, and nothing else, because the structures of TS are verbs combined with arguments. If there are complex theta structures for clause union effects, we would expect the antecedent-anaphor relation for these anaphors to be able to span these complex units. In English the prefix self- has exactly this property: it can relate coarguments of a single predicate, but nothing further away, whether or not a subject intervenes. Its extreme locality can best be appreciated by comparing it with the English syntactic reflexives him/her/it/oneself, which permit the following pattern of antecedents:
(12) a. Stories about the destruction of oneself can be amusing.
b. ‘x’s stories about y’s destruction of x’
c. ‘x’s stories about y’s destruction of y’
(12b) and (12c) are both possible interpretations of (12a); but with the anaphoric prefix self-, instead, only the reading corresponding to (12c) is available.
(13) a. Self-destruction stories can be amusing.
b. *‘x’s stories about y’s destruction of x’
c. ‘x’s stories about y’s destruction of y’
(13b) represents the case where the antecedent is not a coargument of the reflexive; such cases are impossible for self-, but possible for oneself. A first guess about what is wrong with (13b) would be that destruction had a covert opacity-inducing subject; but that account would fail to explain why (12b) is not parallel to (13b).
In the context of RT, if we assign self- to the earliest level, TS, the observed behavior is expected. Anaphors like himself and oneself will be assigned to (possibly) higher levels. The assignment of self- to the lowest level is probably not accidental; being an affix, it has access only to TS, since the levels higher than TS in RT play no role in morphology, as I proposed in the account of the Mirror Principle in chapter 1. This conclusion holds only for what is traditionally called derivational morphology. Inflectional morphology clearly must have access to all levels. In chapter 8 I reconstruct the traditional distinction in RT terms. There are nonaffixal syntactic reflexives that also seem to be confined to TS. For example, Baker (1985) reports that the reflexive in Chi-mwi:ni is a free reflexive like English himself, but it is confined to direct object position and can take only the immediate subject as argument. And in fact one of the Dutch reflexives discussed in the next section is probably another case of this kind.
4.2 Dutch zich and zichzelf (Koster 1985)
The Dutch reflexives zich and zichzelf, discussed in detail by Koster (1985), can be distinguished by assigning them to different RT levels. (14a–d) are Koster’s examples showing the difference in locality between the two.
(14) a. *Max haat zich.
Max hates self
‘Max hates himself.’
b. Max hoorde mij over zich praten.
Max heard me about self talk
‘Max heard me talk about him.’
c. Max haat zichzelf.
Max hates selfself
‘Max hates himself.’
d. *Max hoorde mij over zichzelf praten.
Max heard me about selfself talk
‘Max heard me talk about him.’
(14a) shows that zich cannot take a clausemate antecedent, and (14c) shows that zichzelf can. We may achieve an adequate description of these facts by assigning the two reflexives to different levels: zichzelf to TS, and zich to CS. These assignments are warranted if zich approximates English himself and zichzelf approximates English self-, given the discussion in section 4.1. These assignments explain (14d), insofar as zichzelf, being a TS anaphor, is restricted to coargument antecedents; but they do not, strictly speaking, explain (14a)—that is, why zich is ungrammatical with a coargument antecedent. An obvious first guess is that zich is subject to some kind of Condition B and is too close to Max to satisfy that condition.
However, I think it would be more interesting to explore the idea that zich and zichzelf are in a blocking relation with one another: where one is used, the other cannot be. (See Williams 1997 for a general discussion of the role of blocking in anaphora.) As in other blocking relations, the direction of blocking is determined by the licensing conditions that hold for the two items in the blocking relation; when the licensing conditions associated with one of the items are strictly narrower than the licensing conditions associated with the other, the former will block the latter when those narrower conditions obtain. In the case of zich and zichzelf it is obvious that zichzelf will block zich, because the conditions for licensing TS anaphors are narrower than the conditions for licensing CS anaphors: TS anaphors are limited to coargument antecedents, while CS anaphors are not so limited, but could include them. The existence of (14c), then, is the reason that (14a) is ungrammatical. If this is correct, then Condition B will not be relevant, a conclusion that I will demonstrate again shortly, on different grounds. In general, if a given ‘‘process’’ can occur at more than one level, then an application in an earlier level will block an application in a later level. I would hope that the reasoning will always reduce to the asymmetry of the representation relation, as suggested by the last part of the previous paragraph in connection with reflexives at different levels, but I have not thought through the problem sufficiently to be sure that the logic appealed to there will be available in all cases. This kind of blocking of a late level by an early level is frequent enough to deserve a name, so I will call it level blocking. It will be relevant again in chapters 5 and 6 in connection with scrambling.
The problem for applying the principle in general is identifying instances of the ‘‘same process,’’ a murky concept; it is in fact what makes blocking murky in general, but no more so here. Murky, but inescapable, apparently. Not only is blocking a more interesting theoretical possibility than a Condition B solution to the ungrammaticality of (14a); there are also empirical obstacles to implementing the latter solution. With some inherently reflexive verbs, zich is permitted in a clausemate context.
(15) a. Max wast zich.
Max washes self
‘Max washes himself.’
b. Max schaamt zich.
Max shames self
‘Max is ashamed.’
This strongly suggests that zich is not subject to anything like Condition B; if it were, there would be no account of the difference between (14a) and (15a), since they have identical syntactic structure. But the blocking account of the antilocality of zich at least hints where to look for the answer. Under the blocking account (15a) would be grammatical only if for some reason zichzelf is not permitted with these verbs; and in fact it is not.
(16) *Max schaamt zichzelf.
Why is this so? Perhaps these verbs are only ‘‘formally’’ reflexive; that is, perhaps they are, thematically speaking, intransitive verbs. In that case there would be no possibility of introducing the reflexive in TS, as TS consists purely of theta roles and so contains nothing corresponding to the position of the reflexive. The reflexive is therefore a kind of expletive element in such cases. Having no theta role, it cannot be introduced until CS, when the nonthematic but Case-marked direct object is introduced. But since zichzelf is eligible only for coargument anaphora, it cannot be used, enabling zich to appear. This explanation is satisfying because it relates the thematic structure of the verb to the already established blocking relation between the two reflexives in the only way that they could be related. In fact, this conclusion carries over to English, which also has ‘‘formal’’ or ‘‘expletive’’ reflexives. Consider (17).
(17) John behaved himself.
As noted earlier, the English form self-, being a prefix, must be resolved in TS; therefore, it cannot participate in formal reflexive structures.
(18) *self-behavior
Admittedly, the import of (18) is undercut somewhat by the fact that even the English CS full reflexive form is blocked in such contexts.
(19) *behavior/*shame/*perjury of oneself
It seems that formal reflexives are systematically blocked in nominalizations.
But this again is reasonable in RT, after all, since nominalizations do not have Case, and formal anaphors are pure ‘‘Case holders.’’ Whatever the preposition of governs in nominalizations must be thematic. At first glance wast in (15a) appears to exemplify a third pattern, different from those of both haat and schaamt; but it is actually simply ambiguous between the two. Alongside its intransitive use (as in Max wast) it also has a transitive use; furthermore, the intransitive use has a formal reflexive, just like schaamt, so that wast merely appears to take both reflexives, a situation inconsistent with the use of level blocking. English wash shows the same ambiguity as Dutch wast, except that the intransitive in English perhaps does not take a formal reflexive.
(20) a. Max wast zich/zichzelf.
b. Max washed.
c. Max washed himself.
4.3 Reconstruing Reinhart and Reuland’s (1993) Findings
If the proposals made thus far are correct, we can elaborate on a distinction used by Reinhart and Reuland (1993) (R&R) to account for the behavior of different kinds of anaphora. In the end we will reject their theory of anaphora, because it is incompatible with the one developed here, and because of its own unresolved flaws. Our model will more closely resemble Koster’s (1985), which drew something like the same distinction that R&R’s model draws, but without its limitations. R&R identify circumstances in which the locality of binding is suspended, as in (21a).
(21) a. Johni thinks that Mary likes Sue and himselfi.
b. *Johni thinks that Mary likes himselfi.
The difference between (21a) and (21b) is that in (21b) the reflexive is in an argument position, whereas in (21a) it is in only part of an argument position. R&R conclude that there are two types of anaphor: lexical (SELF, in R&R’s terms) and logophoric (SE). Lexical anaphora holds of coarguments of a single predicate and hence occurs ‘‘in the lexicon,’’ whereas logophoric anaphora is a discourse-level business, the same business that resolves pronoun antecedence. I think the fundamental problem with R&R’s account is that there is nothing in the account intermediate between lexical and logophoric anaphora. In Dutch, for example, zichzelf does seem to hold roughly of coarguments, as we saw, and hence could be construed as lexical. However, not only is zich not a discourse-level anaphor, it in fact has rather tight locality restrictions, something like English himself—a property R&R’s account will entirely miss.
The binary distinction made in R&R’s account leads to two other problems as well. The first problem is posed by ECM constructions, which show opacity effects even though the reflexive is not a coargument of the antecedent.
(22) a. John believes [himself to have won].
b. John thinks that Mary believes herself to have won.
R&R devise the notion ‘‘syntactic coargument’’ for this case: believes assigns Case to himself and a theta role to John, and so they are coarguments in some extended sense. In fact, I should not quarrel too much with this conclusion, as it corresponds so closely to my own, in that one could call an antecedent in RT’s CS a ‘‘syntactic’’ argument. But even so, I think R&R’s account suffers here mainly from having only a binary distinction. Once their account is revised so that ‘‘syntactic coargument’’ replaces ‘‘thematic coargument’’ as the determinant of reflexive antecedence, it becomes impossible to distinguish himself from self- in English, where indeed coargument in the narrowest sense (theta-theoretic coargument) seems to be the governing relation. Consider (23), for example.
(23) *John self-believes to have left.
Here self- cannot correspond to the ‘‘syntactic’’ object of believes. Of course, one might stipulate some property of believe that would rule (23) out, but that would fail to express the very likely and interesting conclusion that such cases are impossible. In the RT analysis of ECM presented earlier, ECM arises from mismapping TS to CS.
(24) a. TS: [John believes x] + [himself to have left] =
b. CS: John believes himself [to have left]
If anaphora applies in CS (or perhaps in CS and TS), then in CS it will relate the two Case positions John and himself. The locality condition will apply in CS and so will be bounded by the subject of believe, as that is the highest NP available in the Case domain of that verb. The other problem R&R’s account never satisfactorily resolves has to do with reciprocals.
Reciprocals in direct object position show familiar locality effects; but other reciprocals, while appearing in a broader range of contexts, do not escape the utterance altogether in finding their antecedents.
(25) John and Mary think pictures of each other are in the post office.
In (26a) each other is clearly not a coargument of John and Mary; but from this we cannot conclude that each other is logophoric, as (26b) shows.
(26) a. [John and Mary]i called on each other at the same time.
b. *[Each other]i’s houses consequently had a forlorn and deserted look.
What is needed again is something intermediate between coargument and discourse level; CS or SS application of reciprocal interpretation will give the right result. In fact, examples like (25), despite not being constrained to coarguments, nevertheless show strict locality effects.
(27) *John and Mary think that Bill wants pictures of each other to be in the post office.
R&R’s account predicts that such cases should be grammatical: since the antecedent cannot by any means be construed as a coargument of its predicate, it must be taken to be a logophoric pronoun and should then show no opacity effects. Stipulating that long-distance reflexives involve movement does not help either, as we saw in the case of Latin that long-distance reflexivization penetrated the most hardened islands. A correct summary of the situation in English must include the fact that there are at least two different uses of the reflexive that cannot be construed as coargument-antecedent-taking cases: one like (27) in which the reflexive occurs in an argument position, in which case it shows subject opacity effects; and another involving coordinated reflexives as in (21a) (Sue and himself), which do not show such opacity effects. R&R’s theory cannot distinguish these two, as it makes only a binary distinction and neither of these qualifies as the ‘‘reflexive predicate’’ case. I will outline the RT account of such cases in the next section.
4.4 Predicate Structure
To resolve a problem with binding theory in RT, it is necessary to posit a level between CS and SS. If that level is identified as the level at which the subject-predicate relation is established, further puzzles can be tentatively resolved, and new differences between the RT treatment and the standard treatment of binding theory emerge.
4.4.1 The Level of Binding Theory
The behavior of German binding theory will compel us to interpose a level between CS and SS, by the following reasoning. Short-distance scrambling in German takes place after Case assignment, because the scrambled NPs retain their Cases. Short scrambling takes place before binding theory applies, because binding theory relations are computed strictly on the output of short scrambling. Furthermore, binding theory (in English) applies strictly before wh movement. Assuming wh movement takes place in SS, we have the following implications:
(28) Case assignment (CS) < short scrambling < binding theory < wh movement (SS)
c. Dareka-ni daremo-o John-ga syookaisita. (unambiguous: ∃ > ∀)
d. Daremo-o John-ga dareka-ni syookaisita. (ambiguous)
(Yatsushiro 1996, as reported in Richards 1997, 82)
(36a) and (36b) are as expected. However, (36c) is surprising on the Checking Theory account: if (36c) has the representation in (37), we expect it to be ambiguous, which it is not.
(37) [NP1-ni [NP2-o [NP-ga t1 t2 V]]]
The reason is that if both NP1 and NP2 are scopally ambiguous between deep and derived positions, then either scope order is possible: NP1 > NP2 if NP1 takes scope from the derived position and NP2 takes scope from the deep position, and the reverse if the reverse. But the fact is simply that NP2 cannot take scope over NP1. Importantly, though, if the two NPs crossing the subject switch their relative order, then ambiguity results again.
(38) a. Daremo-o dareka-ni John-ga syookaisita. (ambiguous)
b. Dareka-ni daremo-o John-ga syookaisita. (unambiguous: ∃ > ∀)
The problem for Checking Theory is that it atomizes the NP movement relations here, as each NP is checked independently of the others. It therefore cannot account for effects that arise from the relative ‘‘movement’’ of two NPs with respect to each other, just the kinds of effects for which RT was envisaged. Richards (1997) uses such examples to promote the idea that Superiority holds for A movement, once ambiguity is controlled for. That is, (38b)
Superiority and Movement
illustrates A movement obeying Superiority, and (38a) doesn't count, being ambiguous. In Richards's account of (38a) the two NPs move to the same functional node, and Superiority dictates their relative order. In (38b) they again move to the same node, in the same order, but an extra higher attractor (EXTRA in (39)) attracts NP-o to a higher position, giving rise to ambiguity; since the extra attractor only attracts NP-o, Superiority does not prevent this movement (see Richards 1997 for formulation of the relevant principles).
(39) [NP2-o [NP1-ni [t2 [NP-ga t1 t2 V]]]]EXTRA
Positing extra attractors does not solve this problem, though. In principle, there is now no reason not to posit yet another attractor that attracts NP-ni over the (derived) position of NP-o, thus again predicting that the order ‘‘NP-ni NP-o NP-ga’’ will be ambiguous.
(40) [NP1-ni [NP2-o [t1 [t2 [NP-ga t1 t2 V]]]]EXTRA1]EXTRA2
But we know from (36c) that it is not. The basic generalization about Japanese quantifiers is, ‘‘If two NPs cross, ambiguity results,’’ understood in such a way that NP-ni and NP-o do not cross in (36c), but do cross in (38a). But Checking Theory, because it atomizes movement relations, cannot deal with cases where several things move in concert. It must be augmented with an extrinsic principle that controls either the input or the output of the derivation in a way that has nothing to do with the operation of Checking Theory itself. In this way, Checking Theory can be shielded against these and other related empirical challenges, but at the cost of having less and less to say about how these systems actually work. How are these facts to be accounted for in RT? Let us suppose that the orders of NP-ga, NP-ni, and NP-o are represented in CS by the following structure:
(41) [NP-ga [NP-ni [NP-o V]]]
And let us suppose that SS can be generated by general rules, such as (42).
(42) S →
[NP1 [NP2 [NP3 V]]]
Suppose further that SS uniquely determines quantifier scope; that is, SS → QS is strictly enforced. The problem then reduces to this: how can (41) be mapped onto (42) isomorphically? There is only one way, of course: NP-ga → NP1, and so
Chapter 6
on. This is why (41) as a surface structure is not ambiguous. The other mappings are misrepresentations of (41). The following two are the misrepresentations that give rise to (38a) and (38b), respectively: (43)
(44)
(43) shows why ‘‘NP-ni NP-o NP-ga’’ has wide scope for NP-ni, and (44) shows why ‘‘NP-o NP-ni NP-ga’’ has wide scope for NP-o. Now the question remains, why can NP-ni have wide scope in ‘‘NP-o NP-ni NP-ga V,’’ when NP-o does not have wide scope in ‘‘NP-ni NP-o NP-ga’’? The answer must be that ‘‘. . . -o . . . -ni . . .’’ is more distant from CS than ‘‘. . . -ni . . . -o . . .’’ and so is warranted only if a difference in meaning is achieved—that is, only if the further mismatch is compensated by a closer match to QS. At least, that is the logic of RT.
6.2.2.4 Long versus Short Scrambling in Hungarian
The standard treatment of long versus short scrambling facts is to posit two different movements, A and Ā, and (or or) two different positions, A and Ā, which are their respective targets. This is the strategy adopted in Checking Theories, for example. But in the context of RT, we could instead propose a single position that is the ‘‘target’’ of two different ‘‘movements’’: a ‘‘virtual’’ movement, which arises as a part of the (mis)representation of one level by another, and an Ā movement, which is intralevel SS movement. Analysis of WCO and Superiority facts in Hungarian suggests that this must be so. There appears to be only one Focus position, which appears just before the verb and whose filling triggers verbal particle postposing; but WCO and Superiority violations arise only when the position is filled by long wh movement.
(45) a.
Kiti szeret az anyjai ti?
who.acc loves the mother-his
b. *Kiti gondol az anyjai hogy Mari szeret ti?
who.acc thinks the mother-his that Mari loves
(É. Kiss 1989, 208)
This suggests that we cannot associate the preverbal Focus position with either A or Ā status; that is, we cannot call it an A or an Ā position, independent of when the movement takes place. In RT we need only set things up in Hungarian so that the Focus position is accessible some time before CP embedding takes place. There is no need to fix ahead of time how a given position in SS will be filled.
6.3 Masked Scrambling
The worrisome thing about the last derivation posited above for Bulgarian (see (31)) is that there is an ‘‘invisible’’ structure in which the independent wh word is not superior to the rest. A similar situation arose in the discussion of Japanese quantifier scrambling in section 6.2.2.3. Perhaps ‘‘invisible’’ scramblings are not allowed. If so, then derivation (31) will not occur, but another one will be allowed, as the scrambling in that case is not invisible. (46)
A ‘‘paradox’’ arises from having both A and Ā movements available for the same ‘‘process.’’ The problem is that a sentence in which Ā movement is supposed to have applied can always be viewed instead as the outcome of an application of A movement, followed by the application of Ā movement; the surface order will be the same, but the interpretive effects will be different. As we saw in (21), repeated here, in Serbo-Croatian short scrambling precedes dependent-independent variable fixing, whereas long scrambling follows it (and so reconstructs for it).
(47) a.
Ko je koga vidjeo?
who aux whom saw
b. Koga je ko vidjeo?
c. Ko si koga tvrdio da je istukao?
who aux whom claimed that t aux beaten t
d. *Koga si ko tvrdio da je istukao?
But what prevents a derivation of (47d) in which first the two wh words switch positions in the lower clause by short scrambling, and then the
same wh words move to the Ā position in the higher clause, thus nullifying Superiority effects?
(48) D-Structure → A scrambling → Ā movement
(Of course, in RT scrambling is not classical movement; I put the matter in classical terms here because the issue is not specific to RT.) If such a derivation were possible, (47d) should be grammatical. We must prevent A scrambling from applying to the wh words in the lower clause, or at least prevent wh movement from applying to its output. There is a subtlety in determining what would count toward making a scrambling ‘‘invisible.’’ Certainly part of it has to do with whether the surface string shows the scrambling order; if it does, then the scrambling is certainly not invisible. However, there is another way in which a scrambling, even one that did not manifest itself in the surface string, could achieve visibility: it could induce some effect in the interpretation. In fact, visibility is a matter of interpretation anyway. The scrambling is visible in the obvious sense if there is some sign of it in the phonological interpretation; therefore, one could easily imagine that the semantic interpretation could provide some sign as well, in the form of an effect on meaning. The crucial case of this type would be the one in which long scrambling appeared to give rise to WCO repair, by virtue of a prior, ‘‘string-invisible’’ short scrambling.
(49) CS: [NP1 NP2 V]S1
PS: [NP2 NP1 V]S1
SS: [NP2 [NP1 V]S1]S2
Scrambling takes place at both CS → PS and PS → SS; the CS → PS scrambling is string invisible. If this derivation is allowed, then the crucial question is, what relation does it bear to the derivation in (50), with which it coincides in both CS and SS?
(50) CS: [NP1 NP2 V]S1
PS: [NP1 NP2 V]S1
SS: [NP2 [NP1 V]S1]S2
Although (49) and (50) are string indistinguishable, they might differ in interpretation.
The difference would center on the interpretive properties of PS and (under the assumptions we have made) would include the bound variable dependencies that WCO governs. A single long scrambling will appear to reconstruct for such dependencies; but a short
scrambling, followed by a long scrambling, will not. So the crucial question is, if a language has both short and long scrambling, and the short scrambling has interpretive effects, are all long scramblings ambiguous? In the cases examined in this book, it appears they are not. From this we would tentatively conclude that the prohibition against ‘‘invisible’’ scrambling is a prohibition against ‘‘string-invisible’’ scrambling. However, I regard this as an open question, and it is entirely possible that the correct answer is more complicated than the present discussion suggests: it might, for example, depend on how evident the semantic effect is. In other words, there is no conclusion about invisible scrambling that follows from the central tenets of RT, and in fact a number of different answers to the questions about it are compatible with those tenets. In what follows I will explore some considerations suggesting that ‘‘string-invisible’’ scrambling should not be allowed, but further research could uncover a more complicated situation. We can in fact observe the behavior of masked scrambling in English. In the context of RT, scrambled orders (i.e., ones that deviate from TS and later structure) are marked; and such deviation can be tolerated only to achieve isomorphy somewhere else. But marked orders must be ‘‘visible’’; that is, there must be some way to reconstruct them. But if the region in which the marked order occurs has been evacuated, then that evidence is gone; for example, once wh movement has taken place in (51), no evidence remains to show which of the two orders was instantiated in the lower clause. In such a case we assume the unmarked order, as it has the lowest ‘‘energy state.’’
(51) a. whi . . . ti NP
b. whi . . . NP ti
(52) Assume Lowest Energy State
If there is no evidence for the marked order, assume the unmarked order.
There is some evidence from English for such a supposition.
The evidence comes from the interaction of scrambling and contraction. The known law governing contraction is (53), illustrated in (54).
(53) Don't contract right before an extraction or ellipsis site.
(54) a. Bill's in the garage.
b. Do you know where Bill is t?
c. *Do you know where Bill's t?
But because English has scrambling that can potentially move extraction sites away from contractions, we can see how (53) interacts with such scramblings. The ‘‘normal’’ order for a series of time specifications within a clause runs from the smallest scale to the largest.
(55) The meeting is at 2:00 p.m. on Thursdays in October in odd years . . .
Any of these time specifications can be questioned.
(56) a. When is the meeting at 2:00 p.m. t? (Answer: on Thursday)
b. When is the meeting t on Thursday? (Answer: at 2:00)
Furthermore, the time specifications can be scrambled, up to ambiguity.
(57) The meeting is on Thursdays at 2:00 p.m.
Crucially, though, scrambling cannot be used to evade the restriction on contraction.
(58) a. Do you know when the meeting is t on Thursday? (Answer: at 2 p.m.)
b. *Do you know when the meeting's t on Thursday? (Answer: at 2 p.m.)
c. Do you know when the meeting is at 2:00 p.m. t? (Answer: on Thursday)
d. Do you know when the meeting's at 2:00 p.m. t? (Answer: on Thursday)
e. *Do you know when the meeting's on Thursday t? (Answer: at 2 p.m.)
(58b) clearly runs afoul of the trace contraction law (53); but why is (58e) not a possible structure that would give the appearance that cases like (58b) had evaded the law? (58e) must be eliminated, and a prohibition against masked scrambling (52) looks like a promising means of doing that. But again, I think it would be foolish not to explore more subtle possibilities governing visibility.
6.4 Locality in RT
In chapter 3 the LEC was used to explain certain locality effects, and in particular the correlation between locality of operations and other properties of operations. This naturally raises the issue of whether all locality effects can be so derived. In fact, not only scrambling is affected by the locality imposed by the LEC—wh movement is as well. As detailed in chapter 3, wh movement cannot extract from structures that are not embedded until after the level at which wh movement applies, and in fact the islandhood of nonbridge verb complements was cited as an example of that kind of explanation. But if we accept the results of this chapter, there will be some obstacles to reducing all locality to the LEC. Specifically, restrictions on wh movement that fall under the traditional rubrics of Subjacency and the ECP cannot be explained. The Wh Island Constraint, for example, cannot be derived. (59) is a typical Wh Island Constraint violation.
(59) *Whati do you wonder who bought ti?
Assume the LEC. As attested by the presence of the wh word in its SpecC, the embedded clause is built up to the level of CP at the level at which wh movement is defined—let's say, SS; but if wh movement is available at SS, there is no timing explanation for the ungrammaticality of (59). If CP is present in the embedded clause, then it is also present, and available for targeting, in the matrix clause. Of course, one could supplement the LEC with more specific ideas about how levels are characterized. For example, one could require that all movement in a level applies before all embedding in a level; then timing would account for the Wh Island Constraint. I am not at all convinced this is worthwhile. To begin with, there are languages that are reported not to have a Wh Island Constraint; this would at least tell us that the stipulation just mentioned was subject to variation, an odd conclusion given its ‘‘architectural’’ flavor. We would especially find ourselves in a bind if we were to accept Rizzi's (1982) conclusion that Italian has a wh island paradigm like the following:
(60) a. *whi . . . [wh . . . [that . . . ti . . . ]]
b.
whi . . . [that . . . [wh . . . ti . . . ]]
That is, extraction from a that clause inside an indirect question is ungrammatical, but extraction from an indirect question inside a that clause is grammatical. Since both wh clauses and that clauses clearly involve CP structure, they are introduced at the same level, and there is no way to make this distinction with timing under the LEC. If it is ‘‘too late’’ to
extract wh in (60a), then it is too late in (60b) as well, and so there is no way to distinguish them. See Rizzi 1982 for examples and for an account of how languages vary with respect to wh-island effects. I will tentatively conclude, then, that wh movement is subject to locality constraints on embedding, beyond those predicted by RT. Importantly, scrambling cannot be subject to constraints beyond those RT imposes. That is because scrambling is not a rule operating within any level, but arises as competing requirements of Shape Conservation are played out. So it is important that scrambling not show any locality conditions that cannot be reduced to the LEC and its effect on timing. From this point of view, the conclusions reached in this chapter about multiple-wh questions are especially significant. Rudin (1988) argues that the primary and secondary wh movements are different sorts of movement—the difference between wh movement and scrambling, respectively. We would thus expect the primary wh movement to obey Subjacency, and the secondary wh movements to obey only the strictures imposed by the LEC. Richards (1997) documents detailed differences between the primary and secondary wh movements that suggest this distinction might be correct. Interestingly, Richards's own theory draws no distinction between the movement of wh and the movement of other elements; they are all instances of Move, which has a uniform (if spare) set of properties. Instead, Richards proposes what he calls a ‘‘Subjacency tax’’ theory of how rules are governed by constraints: if several movements target the same functional projection, the first movement obeys Subjacency, but the rest of the movements are free to apply in defiance of Subjacency (the first one having paid the ‘‘Subjacency tax’’). The tax notion exactly distinguishes the first movement from the rest. Consider, for example, the following cases in Bulgarian:
(61) a.
*Koja knigai otreče senatorăt [mălvata če iska da zabrani ti]?
which book denied the-senator the-rumor that wanted to ban
b. ?Koj senator koja knigai otreče [mălvata če iska da zabrani ti]?
which senator which book denied the-rumor that wanted to ban
(Richards 1997, 240)
The single complex-NP extraction of koja kniga in (61a) is ungrammatical because of Subjacency; but in (61b) the same extraction causes only weak unacceptability, because the primary extraction targeting the matrix SpecC (of koj senator) obeys Subjacency. The movement of NP1 ‘‘pays the Subjacency tax’’; NP2 is then free to move in violation of Subjacency, which it in fact does in this example under reasonable assumptions. (See Richards 1997 for the original formulation of this theory and extensive examples.) In the end, then, Richards's theory delineates approximately the same difference between the primary and secondary wh movements that Rudin (1988) proposed, and that is needed in RT; Richards simply derives that difference from his notion of the Subjacency tax. We have already discussed examples that cast doubt on the view that the two movements are the same kind of movement in the first place: namely, the Bulgarian examples in which a secondary wh word in an embedded clause does not move to its primary counterpart, but nevertheless obligatorily moves within its own clause ((16), repeated here).
(62) a.
Kok tvrdis [da koga tk voli]?
who.nom claim.2sg that who.acc love.3sg
b. *Kok tvrdis [da tk voli koga]?
(Konapasky 2002, 97)
Such examples suggest that the difference between the primary and the secondary wh words has nothing to do with wh attraction. If that were so, the Subjacency tax theory would be irrelevant, as only a single wh word would ever be moved to SpecC anyway.
6.5 Conclusion
In this chapter I have pursued the notion that scrambling and wh movement are fundamentally different: wh movement is an intralevel movement rule, and scrambling is simply the misrepresentation of one RT level by the next level. I have argued in particular that in multiple wh movement languages only one wh expression undergoes wh movement, and the rest undergo scrambling, essentially Rudin's (1988) conclusion. I have argued that assimilating scrambling to wh movement is a mistake, and that in particular the theory proposed by Richards (1997) leaves significant questions unanswered.
After citing problems for the views of others, especially Richards, I think it is only fair to expose a problem with the RT formulation of multiple movement. The problem arises in trying to state precisely what occurs at what levels and to correlate that with conclusions drawn from other languages. For Bulgarian in particular, the problem manifests itself as a conflict between the ordering of scrambling and its locality. Bulgarian wh scrambling is a long-distance phenomenon (at least in the dialects that allow it; see the discussion surrounding (16)), penetrating CPs in particular.
(63) Koj profesori koj vaprosj ti iska [da kaže molitva [predi da obsadim tj]]?
which professor which question wanted to say prayer before that we-discussed
'Which professor wanted to say a prayer before we discuss which issue?'
(Richards 1997, 109; from R. Izvorski, personal communication)
So wh scrambling must occur after CP embedding. At the same time I have supposed that wh scrambling occurs before wh movement, since this explains, in the context of RT, why the independent variable is always exterior. Combining these conclusions with the finding of previous chapters that wh and CP embedding occur in the same level (say, SS) results in the following ‘‘ordering’’ paradox (x > y means 'y happens before x'):
(64) a. wh movement > wh scrambling
b. wh scrambling > CP embedding
c. CP embedding = wh movement
By one consideration, then, wh scrambling is strictly ordered before wh movement; by another, they occur at the same level, exactly the level at which CP embedding occurs. The only way to dissolve paradoxes is to attack their assumptions until one falls. The easiest one to attack here is the identification of the level at which CP embedding takes place and the level at which wh movement takes place.
Ordering either wh scrambling or wh movement before CP embedding is out of the question; in particular, it is incoherent, as it is impossible to extract from something that is not embedded yet. The ordering we need is CP embedding, wh scrambling, wh movement. There is no paradox in this order, so long as there are further levels after the
level of CP embedding. We simply don’t have the means to independently identify the other levels. The other possibility would be to develop some means of allowing wh scrambling to occur after wh movement, but in such a way that a secondary (non-D-linked) wh expression could not be scrambled above the primary one. The latter task is daunting because we have seen that such scrambling above the primary wh expression is possible in certain languages: witness the long topicalization of wh words in Japanese, for example, discussed in chapter 5. I will leave the problem unresolved.
Chapter 7 X-Bar Theory and Clause Structure
Taken together, this chapter and the next provide what I would tentatively call the RT model of phrase structure, inflection, and head-to-head phenomena. Even taken together, they are too ambitious for their length, as they propose a theory of phrase structure that incorporates (i.e., eliminates) both overt and covert head movement, and an account of the morphology/syntax interface (‘‘morphosyntax’’) that presumes to forgo ‘‘readjustment’’ rules. The two chapters are interdependent in that this chapter introduces the definitions of phrasal categories, the mechanisms responsible for agreement and Case assignment, and the relation between these and the inflectional categories marked on the verbal head, and the next chapter proposes a theory about how the inflected verbal head is spelled out. Beyond that, this chapter uses mechanisms that are not fully developed or justified until the next chapter: specifically, the marking of the complement-of relation on category nodes (using the sign ‘‘>’’), the notion of reassociation, and the particular theory of multiple exponence. Before turning to these matters, I would like to outline why I think there is an RT model of phrase structure that is different from the standard treatment, and to briefly suggest how it is different. The phenomena explored here are accounted for in the standard model by a combination of X-bar theory and movement governed by the Head Movement Constraint (HMC; Travis 1984). The HMC is commonly understood to be a subcase of Relativized Minimality (Rizzi 1990). Relativized Minimality says that locality conditions are parameterized and that the significant subcases correspond to the A, Ā, and V (or head) subsystems.
But in the past several chapters I have suggested that the A/Ā distinction should be generalized to, or dissolved into, a more general parameterized distinction (A/Ā/Ā̄/Ā̄̄) defined by the RT levels, and that the locality associated with each of these is determined by the level in which it is defined, in that it is
determined by the size of the structures that are assembled at that level. But now the Relativized Minimality series ‘‘A/Ā/head’’ becomes awkward. There is no natural place for ‘‘head’’ in the new generalization. This suggests that the locality of head movement needs a separate account, not related to the A/Ā distinction or its generalization in RT. V cannot be located in any particular level, but in fact occurs in every level; it differs in this respect from the entities A/Ā/Ā̄ . . . and so cannot be assimilated to them. In fact, verbs, and heads in general, are independently parameterized by the RT levels. See the more extensive discussion of Relativized Minimality in section 7.3. In the following discussion the sign ‘‘>’’ indicates the complement-of relation. In this chapter and the next, we will see that this relation always holds between two elements, but in fact elements of quite diverse types, including at least the following:
(1) a. a word and a phrase (saw > [the boy]NP)
b. a morpheme and a morpheme (pick < ed)
c. a word and a word (V > V)
d. a feature and a feature (Tense > AgrO)
This is further complicated by the fact that words make up phrases, features make up word labels, and so on, and there must be some relation between the complement-of relations of complex forms and the complement-of relations of their parts. What follows, in this chapter and the next, is a calculus of these relations that seems to me to be the most appropriate for RT.
This is further complicated by the fact that words make up phrases, features make up word labels, and so on, and there must be some relation between the complement-of relations of complex forms and the complement-of relations of their parts. What follows, in this chapter and the next, is a calculus of these relations that seems to me to be the most appropriate for RT. These two chapters flesh out a view of the relation between syntax and morphology that I have put forward in a number of places, particularly in Williams 1981a and 1994a,b and in Di Sciullo and Williams 1987. In those works I viewed the Mirror Principle as arising from the fact that words and phrasal syntax instantiate the same kinds of relations. As argued in Di Sciullo and Williams 1987, the Mirror Principle is nothing more than the compositionality of word formation; that is, [ pick þ -ed ]V as a morphological unit is equivalent to [did pick]VP as a syntactic unit. Both instantiate the complement-of relation between T and V, but one does it in a head-final ‘‘word’’ structure with its properties, and the other in a head-initial ‘‘phrase’’ structure with its own di¤erent properties. These two chapters attempt to provide a more explicit calculus to back up that claim. This view, sometimes called lexicalism, has been confused with another view, one that goes back to Den Besten 1976 and before that to Genera-
X-Bar Theory and Clause Structure
173
tive Semantics, related to ‘‘deep versus surface’’ lexical insertion. The notion ‘‘surface’’ (or ‘‘late’’) lexical insertion of course only makes sense in a derivational theory. But even in a nonderivational theory we can ask what the relation is between the form of the word and the environment it appears in. In earlier work I took the view that the lexicon contains its own laws of formation, sharing some features with, but di¤erent from, the laws of syntax, and that the ‘‘interface’’ between the lexicon and syntax could be narrowed to exactly this: the lexicon produces lexical items with their properties, and syntax determines the distribution of such words solely on the basis of their ‘‘top-level’’ properties, not on the basis of how they came to have those properties during lexical derivation. I think this view is vindicated by the nearly inevitable role that ‘‘lexicalism’’ plays in RT. I think the question of whether insertion is ‘‘late’’ or ‘‘early’’ depends to such an extent on the particular theory in which it is asked that to raise it in the abstract is useless. For example, Generative Semantics and Den Besten’s (1976) theory are quite di¤erent frameworks, so di¤erent that each one’s assumption of ‘‘late insertion’’ can hardly be seen as supporting it in the other. But I do think that the above-mentioned question about ‘‘lexicalism’’ can be fruitfully raised as a general programmatic question. 7.1
Functional Structure
I will propose here an X-bar theory in which a lexical item directly ‘‘lexicalizes’’ a subsequence of the functional hierarchy, where by functional hierarchy I mean the sequence of elements that make up clause structure: T > AgrS > AgrO . . . Aspect > V. In the construction of a clause, the entire functional hierarchy must be lexicalized; however, there is more than one way to accomplish that. For example: (2)
In the theory to be presented, it is not just that was in (2a) bears some relation to the bracketed subsequence T > AgrS; rather, it is T > AgrS in that ‘‘T > AgrS’’ is its categorial label. All types of elements—morphemes, words, compounds, phrases—can realize subsequences; and no element can realize anything but a subsequence. In this theory ‘‘lexicalizing a subsequence’’ is not a derived property of lexical items; rather, it is simply what lexical items do. I suppose the biggest mystery about language that these proposals turn on is where the functional hierarchy comes from in the first place—why there is such a thing as Cinque's (1998) functional hierarchy. Put differently, why is intraclause embedding so fundamentally different from interclause embedding? The question is acute in all frameworks, but has not been seriously addressed. I am not innocent. The tradition of solving syntactic problems by introducing new fixed levels of internal clause structure includes my own dissertation (Williams 1974), which sought to explain transformational rule ordering by appealing to four levels of internal clause structure (and I secretly thought there were six) and arguing that the subparts of that structure had independent existence (as small clauses), but without asking where the structures came from. To me, the mystery is this: why aren't all embeddings on a par, each phrase's properties being determined by its head, or its subparts, in the same way? In such a theory the transition from V to its complement CP would be no different from the transition from T to AgrS or from AgrS to AgrO. But this simplest state of affairs is not what we find, and in acknowledgment of the mysterious distinction I will refer to them as complement embedding and functional embedding, respectively. A related mystery is, why is the internal structure so rigid? Cinque (1998) has identified over a hundred steps from the top to the bottom of a clause.
In fact, the clause might not have the purely linear structure Cinque suggests. It at least seems to have some adjunct ‘‘subcycles,’’ as shown here:
(3) a. John let us down every month every other year every decade . . .
b. John let us down on every planet in every galaxy . . .
c. [[[VP] XP] XP] . . .
d. ?John let us down because Mary was there because he was sick . . .
Recursion of time and place is possible, as schematized in (3c), but perhaps not of causes. The obvious nesting of meanings in such subcycles
suggests that the whole structure itself might be explicated in terms of meaning, but nothing substantive has been forthcoming, and I have nothing to add myself. At any rate, what follows is a theory of what the syntax of expressions that express a single functional hierarchy can look like, and it executes the idea that all items, including all lexical items, lexicalize (or realize) subsequences of the functional hierarchy. As interesting as I think the consequences are, I must warn in advance that my proposals do not address this mystery of why complex phrases with fixed functional structure exist in the first place. I hope that whatever the solution to this mystery turns out to be, it will be compatible with what follows, and so I will take the existence of the functional sequence and its linear structure as axiomatic.
7.2 An Axiomatization of X-Bar Theory
Consider the complement embedding of the direct object NP under V. Full NPs (or DPs, whichever turns out to be right) may not exist until SS; at least, some of their components, such as relative clauses, do not exist until then. Nevertheless, TS contains a ‘‘primitive’’ version of an SS NP, CS a more developed version, and so on; and these are in correspondence with one another under Shape Conservation.
(4) a. TS: [amalgamatev holdingsnp]vp →
b. CS: [amalgamateV [his holdings]NPacc]
(The introduction of adjuncts, in this case his, will be taken up later.) To propose a term, there is a shadow of the CS NP his holdings in TS, and that shadow is its correspondent under representation: the np holdings in TS. Functional embedding, on the other hand, introduces material into a tree at a later level that has no shadow or correspondent in TS. Suppose, for the purpose of exposition, that T(ense) is introduced in SS; then the surface structure in (5b) will represent the Case structure in (5a).
(5) a. CS: [amalgamateV [his holdings]NPacc]VP →
b. SS: [amalgamate[T>V] [his holdings]NPacc][T>V]P
T in SS clearly has no correspondent in CS. T is not an independent node in the surface structure or any other structure; rather, it is a feature that has been applied to the projection of
176
Chapter 7
V. (Shortly I will explain how such structures arise and what expressions of the form [x > y] mean.) Functional embedding can also introduce a lexical head, like complement embedding. In this version, auxiliaries realize functional elements. (6) a. CS: [amalgamate [his holdings]NPacc ] ‘ b. SS: [willT [amalgamate [his holdings]NPacc ]VP ]TP Complement embedding, on the other hand, has no analogue of (5b); it is always done by explicit subordination to an overt head. That is to say, the main verb is never realized as an a‰x on its direct object. This is because the construct consisting of the a‰x plus the direct object would have to have a label, and that label would violate the second axiom in the formalism I will provide shortly. Important for the present discussion is that in neither style of functional embedding (feature or full word) does a shadow of the embedding element (T in (5), will in (6)) appear in TS. In this section I want to develop the rationale for the distinction between complement embedding (4) and functional embedding (5). It is central to the way in which RT and minimalist practice di¤er, and the distinctive consequences of RT stem from it. The discussion culminates in an axiomatization of X-bar theory. Although complement and functional embedding di¤er in the fundamental way just mentioned, they both are compatible with the principle of Shape Conservation, which holds of the successive members of a derivation regardless of what kind of embedding is involved. Consider the mapping in (4); it puts in correspondence the elements in the theta structure and the Case structure, and also their relations, in the following sense. First, the TS ‘‘verb’’ amalgamate and the CS ‘‘verb’’ amalgamate are in what we might call ‘‘lexical’’ correspondence; that is, these are two faces of the same lexical item. 
A lexical item has traditionally been understood as a collection of different forms of the same thing; the usual list of the forms includes syntactic form, phonological form, and semantic form. I would expand that list to include all of the RT levels, but the idea is the same: a lexical item is the coordination of its contributions to all the levels it participates in. Thus, for lexical items the representational mapping conserves the relation "x is a 'face' of the lexical item y." Second, the mapping also conserves the complement-of relation: in TS holdings is in a theta relation to amalgamate, and in CS it is in a Case relation, but since these are the complement relations of the two respective representations, the correspondence is again conservative. And third, the head-of relation is conserved: heads are mapped into heads.

X-Bar Theory and Clause Structure
177

We can formalize this conservation somewhat in terms of the notion commutation. If a relation is preserved, we can say that it commutes with the representation relation. For example, we will say that the head-of relation commutes with the representation relation, in that the following relation will always hold:

(7) The head of the representation of X = the representation of the head of X.

Schematically:

(8)
    [amalgamatev holdingsnp]vp  --representation-->  [amalgamateV [his holdings]NPacc]
              |                                                |
           head-of                                          head-of
              |                                                |
              v                                                v
         amalgamate          --representation-->          amalgamate
Construing representation this way allows a more abstract characterization of what is conserved than simply geometrical congruence, though geometrical congruences will certainly be entailed by it. I have spoken sometimes of a subpart of a structure at one level as the "correspondent" or "shadow" or "image" of a subpart of a structure at a different level. The shape-conserving mapping between levels warrants such locutions. The shape-conserving mapping is defined as a mapping of one whole level (i.e., set of structures) onto another whole level. As a part of that process, it derivatively maps individual structures at one level to individual structures at another level. Further, if it conserves the part-of relation, then it will map parts of individual structures at one level to parts of individual structures at the other level. In (8), for example, holdings is a part of the TS:VP amalgamate holdings. That TS:VP is mapped to an XS:VP amalgamate his holdings; in virtue of that mapping, and its conservation of the part-of relation, TS:holdings is also mapped to XS:his holdings. That is, TS:holdings is the TS correspondent, or shadow, of XS:his holdings under the shape-conserving mapping. If the mapping were not shape conserving, then it would be impossible to know what the correspondents of an XS:phrase were. And in fact because the mapping allows deviations and therefore is not fully shape conserving, and also because new elements enter at each level, problems may well arise in some cases in determining what is the correspondent of what. But in general there is a coherent and obvious notion of the earlier correspondents of a phrase.
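The commutation requirement in (7) can be made concrete with a small sketch. The following Python fragment is my own illustration, not part of the book's formalism: the structures and the representation mapping for (8) are hard-coded stand-ins, and all names are hypothetical.

```python
# A toy check of commutation diagram (8): the representation mapping from
# TS to CS commutes with head-of. A phrase is a (head, parts) pair; the
# representation relation is stipulated extensionally for this one case.

ts = {"vp": ("amalgamate", ["amalgamate", "holdings"])}      # TS: [amalgamate holdings]vp
cs = {"VP": ("amalgamate", ["amalgamate", "his holdings"])}  # CS: [amalgamate [his holdings]]VP

represent_phrase = {"vp": "VP"}                # phrase-level representation relation
represent_item = {"amalgamate": "amalgamate"}  # lexical correspondence: two faces of one item

def head(level, name):
    """head-of: return the head of the named phrase at a level."""
    return level[name][0]

# (7): the head of the representation of X = the representation of the head of X
assert head(cs, represent_phrase["vp"]) == represent_item[head(ts, "vp")]
```

If the feature composition of a phrase and its head were allowed to diverge (as in the percolation failure discussed below in this chapter), the assertion would fail; that is all "breaking the commutation diagram" amounts to here.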
Complement embedding always involves one lexical freestanding head embedding a phrase as its complement. But, as already mentioned, functional embedding is often, though not always, signified not by a lexical item, but simply by a feature on the head, as in (5), where the feature controls some aspect of the morphology of the head. We might call this kind of embedding affixal embedding, since its sign is usually an affix (perhaps silent) on the head. As with complement embedding, we want to understand how this functional embedding relation conforms to Shape Conservation. I will consider two ways of making sense of this kind of embedding.

In standard minimalist practice, stemming from Travis 1984, affixal embedding is accomplished by head-to-head movement, wherein the main verb is generated in a phrase subordinate to the affix (or its featural composition) and then moves to the affix. The successive movements of the verb account for the Mirror Principle, since if the movement is always, for example, left adjunction, the order in which the affixes (now, suffixes) appear on the verb will correspond to their hierarchical embedding in the structure that the verb moved through.

(9) [[[[V + af1] + af2] + af3] . . . [ . . . tV+af1+af2 . . . [ . . . tV+af1 . . . [ . . . tV]VP]F1P]F2P]F3P
    (where afi bears features Fi)

Such a proposal accounts for the Mirror Principle by building morphological affixation directly into the syntactic derivation in a particular way. Although such a view still has explicit adherents (see, e.g., Cinque 1998), most researchers have retreated from this strongly antilexicalist view. Unfortunately, the retreat has usually involved a weakening of the explanation of Mirror Principle effects. The account I will present here will divide the problem into two parts: first, how does X-bar theory regulate the information on phrase labels? And second, how does morphology realize those labels when they occur on terminal nodes?
Despite being more complicated than the hybrid Cinque-style theory in having two separate components, phrasal and morphological, it succeeds in capturing the full Mirror Principle effects, because it involves nothing but X-bar theory and direct morphological realization. In other words, there is nothing more relating the two—no readjustment rules, no movements of any kind, and therefore no locality conditions of any kind, just the calculus itself. The practical problem with locality conditions and readjustment rules is that they lead to "instantly revisable" theories. It therefore seems to me that the strongest possible theory lies down the path that begins by separating phrasal syntax and lexical syntax from one another.

In the place of head-to-head movement for phrasal syntax, I propose that a feature can directly take a phrase as a functional complement, and when it does, the feature is realized on the head of the phrase by the interaction of Shape Conservation and the definition of head. If we want to embed a phrase under a full lexical item H, the most obvious and simplest way to do so is by concatenation: that is, concatenate the head and the phrase, and name the resulting phrase after the head.

(10) Lexical Head + Some Phrase = [Lexical Head Some Phrase]LHP

But suppose that instead we want to subordinate a phrase to a feature, as in (11). The simplest way is to add the feature to the feature complex of the phrase itself. Then the feature will "percolate" to the head. The percolation is in fact forced by representation—in particular, by the commutation of representation and the head-of relation. If the feature did not percolate (downward) in SS, then the SS:VP would not count as having a head (since its feature composition would be different from that of its V), and this would break the commutation diagram. Note that the representation relation is not symmetric; the surface structure has "more information" than the Case structure. This reflects the general asymmetry of the representation relation already discussed and does not alter the conclusion about percolation.

The notation X > Y used in (11) and elsewhere indicates the complement-of relation; it means 'X takes Y as a complement'. For example, T > V is what results from adding T to the featural complex I have abbreviated by V. In other words, the label itself is structured by the complement-of relation. This is meant as an alternative to the usual notion that a node is a set of features, with no order or relation among them. In the account I am proposing here, nodes are features in "complement chains" of the kind that can be symbolized as A > B > C > D > E.
(See chapter 8 for more on this, and for some theorems about the latent descriptive power of this notation.) The notation gives structure to the set of features. That structure makes possible a simple axiomatization of X-bar theory, at least insofar as X-bar theory concerns the well-formedness of phrase labels in trees—instantiating in particular the head-of relation and a feature percolation mechanism. Below are two trees that will fall under the axiomatization. (12a) is an example of a simple clause with a single main verb. (12b) is an example of a clause with an auxiliary verb and a main verb. The axioms to be discussed will be illustrated with respect to (12a).

(12) [tree diagrams (12a) and (12b) not reproduced]
(12a) illustrates two properties we want the system to have. First, each node is structured with respect to the complement-of relation. Second, there are only three relations that can hold between a mother node and a daughter node:

(13) Axiom 1 (The juncture types of X-bar theory)
     There are just three juncture types:
     a. mother node = X > daughter node (embedding)
     b. daughter node = mother node > X (satisfaction)
     c. mother node = daughter node (adjunction)

Case (13a) licenses embedding the daughter node under X at the mother node. For example, in tree (12a) T embeds AgrS at the top. Case (13b) licenses the "satisfaction" of features under agreement; for example, in tree (12a) AgrO is discharged or "checked" by the direct object, as illustrated by the relation of [T > AgrS] to its daughter [T > AgrS > AgrO]. Finally, case (13c) licenses adjunction structures, where mother and daughter nodes are identical; this is illustrated in tree (12a) by the two Adv nodes. On this account agreement is strictly local. The AgrS feature percolates as far as it likes, except of course that it can be "checked" only by a feature, and only when it is peripheral in the label, and it must be checked by a sister to the label.

I think Axiom 1 is in fact X-bar theory itself. It tells what form the succession of heads must take, thereby defining the notion "head"; and at the same time it defines the permissible percolations of features. But the structures found in natural language are also defined by that previously discussed mysterious condition on functional sequences, which I will call the Pollock-Cinque functional hierarchy (PCFH).

(14) Axiom 2 (PCFH)
     There is a universal set of elements (T, AgrS, AgrO, . . . , V) that are in a fixed chain of complement-of relations: (T > AgrS > . . . > AgrO > V)
     Labels must be subsequences of this hierarchy.

Labels in trees must conform to both Axiom 1 and Axiom 2. The structures admitted by Axiom 1 filtered by Axiom 2 turn out to be just the right structures; that is, they turn out to be structures like (12a), and most other possibilities are left out.
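As a check on how the two axioms interact, here is a small Python sketch of the label calculus. It is my own illustration, not part of the theory: the four-member hierarchy stands in for the full PCFH, and "subsequence" is implemented as contiguous subsequence, on the reading suggested by the smooth variation of labels along a projection.

```python
# A minimal sketch of Axioms 1 and 2 as a label calculus. Labels are
# tuples of features ordered by ">", e.g. ("T","AgrS","AgrO","V") stands
# for T > AgrS > AgrO > V. The hierarchy below is an illustrative fragment.

PCFH = ("T", "AgrS", "AgrO", "V")   # Axiom 2's fixed functional hierarchy

def obeys_axiom2(label):
    """Axiom 2: a label must be a (contiguous) subsequence of the PCFH."""
    n = len(label)
    return any(PCFH[i:i + n] == label for i in range(len(PCFH) - n + 1))

def juncture_type(mother, daughter):
    """Axiom 1: the only licit mother/daughter label relations."""
    if len(mother) == len(daughter) + 1 and mother[1:] == daughter:
        return "embedding"      # (13a) mother = X > daughter
    if len(daughter) == len(mother) + 1 and daughter[:-1] == mother:
        return "satisfaction"   # (13b) daughter = mother > X (X checked)
    if mother == daughter:
        return "adjunction"     # (13c) mother = daughter
    return None                 # anything else is ruled out

# Going up a projection spine, labels change only at the ends, so every
# label below is licit and every adjacent pair is a licit juncture:
spine = [("V",), ("AgrO", "V"), ("AgrS", "AgrO", "V"), ("T", "AgrS", "AgrO", "V")]
assert all(obeys_axiom2(label) for label in spine)
assert juncture_type(("T", "AgrS", "AgrO", "V"), ("AgrS", "AgrO", "V")) == "embedding"
assert juncture_type(("T", "AgrS"), ("T", "AgrS", "AgrO")) == "satisfaction"
assert not obeys_axiom2(("T", "AgrO"))   # skips AgrS: filtered by Axiom 2
```

The last assertion shows the filtering at work: Axiom 1 alone would tolerate a label [T > AgrO], but Axiom 2 rejects it because it is not a subsequence of the hierarchy.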
Axiom 2 guarantees that a lexical item, or in fact any element, whether simple or derived, must lexicalize a subsequence of functional structure.
The "operations" of embedding and satisfaction defined in Axiom 1 preserve the subsequence property: they add to labels and remove from them only at the ends, so that going up the projection, the labels on the heads will vary smoothly from V to C, never departing from subsequencehood.

Axiom 1 in RT accomplishes much of what is done by (covert) verb movement (or head-to-head movement) in standard minimalist practice. As mentioned earlier, head-to-head movement captures at least part of the mirror relation that exists between word structure and phrase structure. Axiom 1 accomplishes the same work by limiting successions between hierarchically adjacent nodes to pairs that differ only at the top (13a) or the bottom (13b) of the label; this has the provable effect that the label on the very lowest node bears a mirror relation to the succession of phrases that dominates it. The fact that every pair of labels must meet the stringent conditions of (13) is comparable to the restriction, in minimalist practice, that head-to-head movement is extremely local. The admissibility of adjunct junctures (13c), which leave the label unchanged, corresponds, in minimalist practice, to the feature of Relativized Minimality that makes certain adjuncts invisible to head-to-head movement.

Axioms 1 and 2 account for what the syntactic structures can look like, but do not say how labels are spelled out. The spell-out of labels is the topic of chapter 8, where the morphological interpretation of ">" is taken up.

The role of X-bar theory in RT, then, can be summarized as follows. There is a relation "head-of" that holds in all levels and participates in the correspondences between the levels under Shape Conservation. There are two kinds of embedding: functional embedding and complement embedding. Functional embedding occurs between levels, because some elements (e.g., T) are simply not defined for levels earlier than PS in the version of RT given in chapter 4, for example. A functional element may be introduced as a free lexical item or as a feature. In either case it subordinates another phrase; if it is a feature, it is added to the label of the phrase it subordinates, and that added feature propagates down to the head to preserve the head-of relations that the structure enters into.

With this more specific understanding of how X-bar theory operates in the architecture of RT, I would like to return to the discussion in chapter 3 in which I suggested that the embedding of different kinds of Ss occurs at different levels, and that locality and clause union effects of the array of small clause embedding types can be made to follow from that arrangement. The specific problem I want to address is that I risk inconsistency
with that earlier conclusion if I now say that NPs have shadows in TS, but Ss (at least some—for example, tensed Ss) do not. Furthermore, if Ss do have shadows in early structure comparable to the ones that NPs have, then the LRT correlations laid out in chapters 3–5 are jeopardized in a way I will explain shortly. The main reason for saying that the head of an NP appears as a shadow in TS is for selection, and we do find tight selection between the verb and the head noun of the direct object. On the other hand, there is no selection whatever between the matrix verb and the verb of a that clause. This is ordinarily understood as resulting from the fact that that is the head of the clause, and the matrix verb selects the head. Although this answers the point about clauses, it raises a problem for the DP theory of NPs (e.g., Abney 1987): if D is really the head of the direct object, then it is hard to see why there is selection between V and the N beneath D. In fact, the di¤erence between NP and S is even more extreme: the main verb does not even select the tense, or for that matter the finiteness, of the embedded complement. Grimshaw (1978) made this point clearly when she showed that when a verb selects wh, it cannot select even the T value of the IP beneath wh (much less its main verb); consequently, any verb that selects wh automatically and inevitably takes both finite and infinitive complements. the bird sings (15) I know why . to sing This means that the apparent selection for T shown by most verbs must be mediated. he left (16) I know that . *to leave That is, know selects that and that selects [þfinite]. Some di‰culties arise on this view; for example, some predicates seem to be able to determine the subjunctivity of that clauses. (17) a. It is important that he be here. b. *It is known that he be here. 
In addition, sequence-of-tense phenomena, although not involving selection by the main verb, do show that that is not absolutely opaque, since they link main and embedded T specifications. Despite these problems I will assume that Grimshaw’s conclusion is essentially correct, and that verbs select only for C.
It is a lexical fact about the complementizer that in English that it, unlike wh, is restricted to finite complements. Overt complementizers in other languages are not so restricted, taking both finite and infinitival complements; and in fact English whether does so as well.

(18) I don't know whether {to go / he went}.

In RT the difference in the behavior of NPs and Ss with respect to selection will follow from the fact that NP will have its head N as its shadow in TS, whereas a that clause will have only that, the selected head, as its shadow in TS. This is exactly what we would expect if that were the head of the that clause and N were the head of the NP. On this view what is special about that is that its complement (TP) is not defined until SS, because T itself is defined only at SS.

This conclusion gives up the DP hypothesis of NPs, but for a good reason: the obvious difference in selection between NPs and Ss. Proponents of the DP hypothesis (Abney (1987), and others) have taken pains to develop mechanisms and definitions that permit selection between the matrix verb and the N head of NP (inside DP), but not in a way that draws any distinction between NP and S. As a result, the hypothesis suggests that the same selection will be found with Ss; but it is not—selection by the main verb stops with that for CP clauses. Moreover, while that is selected by verbs, as suggested by the CP hypothesis for clauses, the D of a DP is never selected by verbs; if a verb takes a DP at all, then it takes the full range of determiners, with some completely explainable exceptions.

In RT, then, there is a fundamental difference between the embedding of NPs and the embedding of Ss: an NP complement is embedded in TS, and in all subsequent levels; S embedding, on the other hand, is distributed across the RT levels depending on what kind of S is being embedded. Baker (1996) offers some evidence for treating NP and S embedding in sharply different ways.
He shows that in polysynthetic languages NP arguments do not occupy theta or Case-licensing positions; rather, what appear to be the expressions of NP arguments are actually adjuncts. The arguments involve standard binding-theoretic tests for constituency. S complements, on the other hand, are embedded as arguments exactly as they are in English; again, standard binding-theoretic arguments involving c-command lead inevitably to this conclusion.

Actually, there is a version of RT that offers the possibility of having it both ways. In this version NPs could be N-headed in TS and D-headed in
SS, and clauses would be that-headed in both levels, thereby preserving their different selectional behavior. I will not pursue this possibility here, because it threatens to undermine the LRT correlations of chapter 3, in the following way. If correspondence across levels does not respect categories, as the NP ↔ DP correspondence would not, then possibilities arise that defeat the LRT correlations. Suppose, for example, that an IP is embedded beneath V at an early level (somewhere before SS—say, PS) for ECM, raising, and obligatory control constructions, as suggested in chapter 3. Various clause union effects that depend on the absence of CP structure (e.g., obligatory control) could take place there; then the IP could "grow" a CP through correspondence with an SS structure; in other words, an SS:CP would be put in correspondence with the PS:IP. The SS:SpecC could then be the target of wh movement. We would then have derived obligatory control across a filled SpecC, exactly contrary to the prediction outlined in chapter 3. The following illustrates the derivation just described, with (19a) ↔ (19b):

(19) a. PS: NPi [V [NPi . . . ]IP] (obligatory control established)
     b. SS: NPi [V [wh [NPi . . . ]IP]CP] (obligatory control preserved, wh movement)

The straightforward way to avoid this defeat of the LRT correlations is to prevent correspondence under Shape Conservation where the categories are not homogeneous: [[ . . . ]IP . . . ]CP cannot be a representation of [ . . . ]IP. The only "growth" that is allowed is growth that preserves the category, essentially adjunction. This would be a feature of the Shape Conservation algorithm, which unfortunately is still under development. But if this feature survives further investigation, then NP cannot become DP under shape-conserving "correspondence." We could still maintain that a TS:NP could be embedded in SS under a D.
However, since it would not have had any previous communication with the V that the DP is embedded under, we would need, as Abney did, to make D transparent to selection so that V could directly see NP beneath D in the structure.

(20) [V [ [ ]NP ]DP]

7.3 Relativized Minimality
One must pause soberly before putting aside one of the most fruitful ideas of modern linguistics, but if the theory I have developed thus far is taken seriously, Relativized Minimality must be seen as a pseudogeneralization.
Although there is some correspondence between head movement and the mechanisms proposed here, a close examination of the context in which head movement operates reveals decisive differences. The real content of a theory with head-to-head movement lies in the constraints limiting the movement, as otherwise the theory says, "Anything can move anywhere." The best candidate for the theory of the bound on head movement is the Head Movement Constraint (Travis 1984), and in particular, the generalization called Relativized Minimality (Culicover and Wilkins 1984; Rizzi 1990).

The main problem with Relativized Minimality in the context of RT was stated earlier in this chapter. The generalization of the A/A distinction to the A/A/A . . . distinction and the rationalization of the properties of each type in terms of its association with a level under the LEC leaves no room for heads, as heads themselves have no privileged relation to any of the levels, occurring in all of them. But if head movement is not covered under anything like Relativized Minimality, then some other account of the localities it exhibits must be sought.

Other considerations point in the same direction. For the A/A/A . . . series, Relativized Minimality is weak compared with the locality that follows from the RT architecture in that it permits rule interactions that cannot arise in RT. For example, Relativized Minimality permits head movement over SpecCP, to the matrix verb.

(21) a. [V + C [wh tC IP]]
     b. *I wonder-that [who tthat Bill saw t]

The reason is that according to Relativized Minimality, different systems—head, A, A—do not interfere with each other; they only self-interfere, in the sense that a movement of type X will be bounded only by occurrences of targets of type X. But (21) is not possible in RT with the LEC. It remains to find out whether languages instantiate the type of structure illustrated in (21), but the prediction is clear.
Likewise for cases in which an A movement bridges an A specifier—the latter includes what has been called superraising, as discussed in section 3.1, and is again not possible in RT on principled grounds.

Another difference between head movement governed by Relativized Minimality and the account of inflection and agreement suggested here lies in the different ways that one can "cheat" in the two theories. I say "cheat in," but I should probably say "extend": different theories allow different sorts of "natural" extension, and I think a theory should be evaluated on the basis of whether its natural extensions would be welcome or not.
There are two obvious ways to cheat with head movement in Relativized Minimality, and in fact both have been exploited, or should I say, explored. One is to extend the number of self-interfering systems, to four (or more); in the limit, the theory reduces to the null theory, as in the limit every element belongs to a different category from every other element, and so nothing interferes with anything. The other obvious way to cheat is to sidestep the locality condition by chaining a number of little moves together into one large move; in the context of head movement this is called excorporation. Both are standard. For example, in Serbo-Croatian we find the following evidence of verb clustering:

(22) a. Zaspali bejahu.
        slept.prt aux.3pl
        'They had slept.'
     b. *Zaspali [Marko i Petar] bejahu.
     c. [Marko i Petar] bejahu zaspali.
     (Konapasky 2002, 233)
The auxiliary and the participle cannot be separated, suggesting that they form a tight cluster, one naturally seen as arising from head movement of the lexical verb to the participle. But when there are two participles (byl and koupil in the following related Czech examples), the second seems able to hop over the first.

(23) a. Tehdy bych byl koupil knihy.
        then aux.1sg was.prt bought.prt books.acc
        'Then I would have bought books.'
     b. Byl bych tbyl koupil knihy.
     c. *Koupil bych byl tkoupil knihy.
     (Konapasky 2002, 233)
This appears to be a case of "long head movement." There are two ways to extend Relativized Minimality to accommodate this phenomenon. First, we might increase the number of self-interfering categories, the course taken by Rivero (1991), who proposes to account for the pattern in (23) by saying that bych is functional, while byl and koupil are lexical: in (23b) lexical grammatically hops over functional, whereas in (23c) lexical ungrammatically hops over lexical. Second, we might say that clustering takes place in the usual way, but then excorporation accounts for the possibility of (23b) (and details about how it operates account for (23c)); this is the course taken by Bošković (1999). See Konapasky 2002 for summary discussion and critique.

In the theory presented here, where there is no head movement, and no Relativized Minimality, these extensions are not available. The inadmissibility of excorporation will follow from theorems about reassociation given in chapter 8. And separating heads into two types will make no difference to the system under discussion here, so long as both types are part of the calculus of complement taking. But in fact the X-bar theory proposed here has its own way to accommodate such facts. I will postpone my own analysis of the Serbo-Croatian paradigm until chapter 8, where I claim that verb clustering follows from a narrow theory of label spell-out. For the time being I simply want to emphasize that the types of solutions or extensions available to RT are very different, lacking as it does head movement and its governing theory of locality, Relativized Minimality.

7.4 Clause Structure and Head-to-Head Movement
The sort of X-bar theory sketched in the previous section will permit a full account of head-to-head movement effects without movement, locality conditions, or readjustment rules.

7.4.1 Reassociation and Case-Preposition Duality

There is a functional equivalence between Case marking and prepositions, long recognized but not formalized; for example, there is some equivalence between to NP and NPdat. This is not to say that the two are interchangeable, just that there seem to be two ways to "mark" an NP, or two ways to "embed" an NP under a Case/preposition (see Williams 1994b). Suppose that P stands for some Case/preposition; suppose further that the relation between the Case marking/preposition and the NP it is attached to is one of embedding. I will leave open whether it is functional embedding or complement embedding (it can probably be either, depending on the preposition), and I will use the symbol ">" already introduced to indicate the embedding relation. Then the equivalence we are discussing is this:

(24) [P > NP]P ≅ [[ . . . [P > N] . . . ]P>N]P

On the left P governs the full NP and projects a P node; on the right P > N is realized on the head noun (as, for example, [dat > N], which is
more traditionally notated as Ndat). In both cases P subordinates N. I will call this relation Case-preposition (C-P) duality, even though it will turn out to be a broader relation.

In what way are these two structures equivalent? And in what way are they different? They are obviously not identical, in that in a given construction in a given language with a given meaning, only one of them can be used; so English has only to boys, whereas Latin has only pueribus. Nevertheless, the two expressions are alike in two ways: first, the relation between P/dat and the N/NP is approximately the same in the two cases, and second, the distribution of P > NP and [P > N]P is approximately the same (i.e., they fulfill the same function, that of expressing the dative argument of a verb). But what is the basis of their equivalence? Why should they be alike at all?

We might regard the two representations in (24) as mutually derivable by abstraction/conversion ("⇒" signals abstraction, and "⇐" conversion).

(25) [ . . . [X > Y] . . . ]X ⇔ [X > . . . [Y] . . . ]

I will aim to derive the equivalence from the X-bar theory developed here without any independent operations, but I will nevertheless refer to these operations in exposition. Specifically, I will explore the possibility that C-P duality is nothing other than the relation called reassociation in chapter 8. There, reassociation is shown to be a property of the complement-of relation in the morphology of the functional system, so that, for example, if the left-hand side is a valid expression, then so is the right-hand side, and vice versa.

(26) [[X > Y] > Z] ⇔ [X > [Y > Z]]

The relation accounts for, among other things, a kind of "clumping" in how functional elements are realized morphologically. Given that functional elements are strictly ordered (T > AgrS > AgrO > V), one would expect only right- (or perhaps left-) linear structures to realize the inflected verb; however, morphological structures like these are found as realizations of this order:

(27) Swahili inflected verb
     [a-li] [ki-soma]
     AgrS-past AgrO-V
     [AgrS > T] > [AgrO > V] ⇔ [AgrS > T > AgrO > V]
     (Barrett-Keach 1986, 559)
The structure is a symmetrical binary tree, rather than a right-branching one (see chapter 8 for details; see also Barrett-Keach 1986). Why can the symmetrical structure on the left realize the linear chain of functional elements on the right? The bottom line of (27) shows that the actual structure of the inflected verb enters into a "C-P-like" duality relation with the functional structure it is supposed to represent. See chapter 8 for further discussion. Reassociation is the relation illustrated here:

(28) [A > B] > C ⇔ A > [B > C]

C-P duality of the sort instantiated in (25) can be viewed as a straightforward case of reassociation if we can appeal to a null element 0 to serve as the third term in the reassociation.

(29) [[0 > X] > Y] ⇔ [0 > [X > Y]]

But the extension to 0 suggests an even more exotic possibility. In (25) X is abstracted out, leaving behind Y; but suppose even more were abstracted out, namely, X > Y.

(30) a. [[ . . . [X > Y] . . . ]X>Y]X ⇒ [[X > Y] > [ . . . [0] . . . ]X>Y]X
     b. [0 > [X > Y > 0]] ⇒ [[0 > X] > [Y > 0]] ⇒ [[0 > X > Y] > 0]

In terms of reassociation, (30a) is simply a double application of the operation Reassociate, as indicated in (30b). For X = P and Y = N we would then have:

(31) [[ . . . [P > N] . . . ]P>N]P ⇒ [[P > N] > [ . . . [0] . . . ]P>N]P

This essentially evacuates the head position of the complement entirely—not just the P/Case, but the N as well. This suggests that the head of the noun could be realized on the preposition itself. And this possibility arises purely through X-bar theory, with no further mechanisms, so long as the theory includes C-P duality, as it arises if reassociation holds of X-bar syntax. Such cases resemble the "inflected prepositions" found in Breton, where an agreement mark on a preposition precludes overt expression of its direct object (Anderson 1982).
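The behavior of Reassociate in (26)-(28) can be sketched mechanically. The following Python fragment is my own illustration, not part of the book's formalism: it models ">" structures as nested pairs, and rather than implementing single rebracketing steps, it checks the closure of Reassociate under repeated application, i.e. whether two bracketings realize the same linear chain of functional elements.

```python
# Sketch: reassociation over binary ">" structures. A structure is either
# an atom (a string) or a pair (left, right) meaning left > right.
# Reassociation makes bracketing inert: any bracketing of the same
# left-to-right chain of elements realizes the same functional sequence.

def flatten(s):
    """Recover the linear chain of functional elements from a structure."""
    if isinstance(s, str):
        return [s]
    left, right = s
    return flatten(left) + flatten(right)

def reassociable(s1, s2):
    """Two structures are related by (iterated) Reassociate iff they
    flatten to the same chain, e.g. [[A>B]>C] <=> [A>[B>C]]."""
    return flatten(s1) == flatten(s2)

# (28): [A > B] > C  <=>  A > [B > C]
assert reassociable((("A", "B"), "C"), ("A", ("B", "C")))

# Swahili a-li-ki-soma (27): the symmetrical tree [AgrS>T] > [AgrO>V]
# realizes the right-linear chain AgrS > T > AgrO > V.
swahili = (("AgrS", "T"), ("AgrO", "V"))
linear = ("AgrS", ("T", ("AgrO", "V")))
assert reassociable(swahili, linear)
assert flatten(swahili) == ["AgrS", "T", "AgrO", "V"]
```

The null element of (29) can be modeled as just another atom "0"; the double application in (30b) then amounts to two rebracketings that leave the flattened chain 0 > X > Y > 0 unchanged.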
Examples like (31) will be well formed only if the label [P > N] satisfies Axiom 2; that is, P must be higher than N on the relevant functional hierarchy. There are perhaps two kinds of prepositions: one ‘‘functional’’ and transparent, for which Axiom 2 would be satisfied; and another that takes ‘‘true’’ complements, for which it would not be satisfied (see Williams 1994b for extended discussion). Structures that instantiate the right-
hand side of (30) might be the coalescences of P and pronoun or article found in some languages.

(32) a. zu dem ⇒ zum (German)
        to the.dat
     b. à le chien ⇒ au chien (French)
        to the dog

Taking (32b), and assuming the DP hypothesis, we have:

(33) P > [D > NP]DP ⇒ [P > D] [0 > NP]NP

In this way, head-to-head movement, in its instantiation as a kind of "incorporation," is realized directly by X-bar theory. Used in this way, Reassociate bears an obvious relation to covert verb movement. However, of course it is not movement, and it need not be bounded by any extrinsic locality conditions; rather, it is localized by the X-bar formalism itself.

In the remainder of this section I will explore the possibility that C-P duality, as an instantiated application of Reassociate in syntax, is the appropriate syntax for overt verb movement as well. Given the conclusions of the last two sections, this is an almost obligatory step to take. In section 7.3 I suggested problems with Relativized Minimality as an account of head-to-head relations, partly because in section 7.2 I developed an alternative account of inflection in syntax. But since in the standard account of clause structure Relativized Minimality is the principle governing the locality of overt head movement, something must be developed in its stead if it is to be eliminated on general grounds.

Verb-second (or subject-aux inversion (SAI) in English) can be seen as arising from C-P duality in the following way. A declarative clause is a tensed entity, where the tense is realized on the head of VP (also the head of NP, if nominative is simply the realization of tense on N, as suggested in Williams 1994b).

(34) [NP VPT ]T

By C-P duality, this is the same as (35).

(35) [T [NP VP]]

(35) itself is not instantiated, because there are no lexical items that purely instantiate T (unless do is one). But if V in the tensed clause is represented as T > V, then the indicative clause instead looks like (36),

(36) [NP [T > V]P]T
which, by (radical) C-P duality, is the same as (37).

(37) [[T > V] [NP 0P]]T ⇔ [NP [T > V]P]T

Thus, auxiliary inversion structures (the left-hand side of (37)) arise from uninverted structures through C-P duality. Duality captures the most essential properties of inversion: it is local (only the top tensed verb can move, and only within a single clausal structure), and it is to the left (like P). Duality does not capture the fact that inversion is restricted to modal verbs in English, but not in German, a point to which I will return.

The restriction to movement within a single clausal structure follows from Axiom 2, which says that all labels must be substrings of the PCFH, just as [P > D] in (33) is. T > V either is, or abbreviates, one such substring, but to move into a higher clause it would require a label like [T > C > T], which violates the PCFH. The reason "movement" appears to displace items to the left follows from the fact that T (or for that matter V) takes its complements to the right.

Thus far I have assumed that the subject is an adjunct or specifier of the VP and hence does not participate in the duality. In fact, though, the subject is treated somewhat differently in the two constructions related by the duality. In the V-initial structure, the subject is treated more as a direct object than as a subject, in that adverbs cannot intervene between it and [T > V].

(38) a. *Did recently John leave?
     b. John recently did leave.
     c. *John saw recently Bill.

In this, the fronted auxiliary is playing the same role that the preposition plays in certain absolutive constructions.

(39) a. With John recently departed, . . .
     b. *With recently John departed, . . .

In fact, the absolutive construction is a good model for SAI, and it emphasizes the notion that SAI arises from C-P duality. Some absolutive constructions—for example, the Latin ablative absolute—even use Case instead of P, thus confirming the connection.

(40) Caesare vivo, . . .
     Caesar.abl living.abl
     'With Caesar living, . . .'
That is, the Latin absolutive bears the same relation to the English absolutive that an uninverted clause bears to an inverted clause, the relation of C-P duality. Another model is the consider construction (consider Bill foolish) and similar small clause constructions, in which the V relates to the NP as it would to a direct object.

In English, we can say that the inverted auxiliary relates to the subject in its "derived" position; this is because the invertible verbs are all auxiliaries, and auxiliaries only take subject arguments. But in other inversion constructions this is impossible. In German, for example, the class of invertible verbs includes all predicates, and so the verb must govern aspects of the clause structure that should not be accessible to it from its derived position. For instance:

(41) Gestern kaufteV Hans das Buch tV.
     yesterday bought Hans the book
     'Yesterday Hans bought the book.'

The derived position of the verb should not permit a theta relation to the direct object, because of the locality of theta relations. Rather, the "trace" of the verb should be responsible for that relation, as it is in standard accounts. How can this be done with C-P duality?

C-P duality can give us a kind of trace, if we take the 0 of reassociation seriously. I have used representations in which XPs have 0 heads without saying how they are licensed, and they clearly do not occur freely. In the following structure,

(42) [X > 0P] [X > [0 . . . ]0P ]XP

X "controls" 0 by virtue of governing 0P. It controls it in the sense that the 0P acts, in its interior, as though X occupied its head position. It might not be far-fetched to regard the 0 as an "anaphor," with X as its antecedent. This in fact makes it just like movement: the 0 head of 0P is identified with X. But in this case, the antecedence arises from Reassociate and the complete evacuation of the label on the head when Reassociate applies in the most radical manner.

Control of a 0 head is also found in gapping.

(43) [John saw Mary][T]P and [Bill 0 Pete][0]P.
In the simplest interpretation the [T] label on the first conjunct serves as the antecedent of the [0] label on the second conjunct, and by virtue of C-P duality governs the interior of the second conjunct. This explains the locality of the construction: it cannot occur in coordinated CPs, because the antecedence holds only for immediate conjuncts.
(44) *I think [that John saw Mary] and [that Bill 0 Pete].

In (43) more than just T is deleted in the second conjunct; the verb is also deleted, and the verb is understood as identical to the verb of the first conjunct. In Williams 1997 I suggested that a 0 head always licenses a 0 complement if the following relation holds:

(45) Antecedent of complement of 0 head = complement of antecedent of 0 head.

That is, antecedence and complementation commute. See Williams 1997 for a derivation of (45) from a more general principle and for a discussion of its scope and properties.

Returning to SAI, the notion that the 0 head of 0P is anteceded by whatever governs 0P explains why the absent head nevertheless manages to govern the internal structure of the 0P that it heads. For example, fronted auxiliaries are compatible only with whatever complements are possible when fronting does not take place:

(46) a. Can John [0 swim]0P?
     b. *Can John [0 swimming]0P?
     c. John can swim.
     d. *John can swimming.

Questions about the type of complement of the 0 head are deferred to its antecedent, the fronted auxiliary. A. Neeleman (personal communication) suggests that traces might not in fact be necessary—at least one of the forms in the dual relation will represent a structure at a previous level in which the relevant licensing takes place. So, for example, (46a) is the dual of (46c), and (46c) itself or some structure that it represents licenses the relation between can and the present participle; in that case the trace in (46a) is not needed for licensing.

7.4.2 Multiple Exponence in Syntax

The mechanism I have called Reassociate also provides a means of accounting for multiple exponence in syntax. Multiple exponence is an embarrassment for the theory of labels proposed here; it shouldn't exist.
The reason is, given a functional hierarchy F1 > . . . > F13 as in (47a), where each Fi is subcategorized for Fi+1, applying Reassociate will derive objects like those in (47b), but not like those in (47c) or (47d). (M marks subunits that correspond to morphemes.)
(47) a. F1 > F2 > F3 > F4 > F5 > F6 > F7 > F8 > F9 > F10 > F11 > F12 > F13
     b. [F1 > F2 > F3]M > [F4 > F5 > F6 > F7 > F8 > F9 > F10 > F11 > F12 > F13]M
        [F1 > F2]M > [F3 > F4]M > [F5 > F6 > F7 > F8]M > [F9 > F10]M > [F11 > F12 > F13]M
        [F1 > F2 > F3 > F4 > F5 > F6 > F7 > F8 > F9]M > [F10 > F11 > F12 > F13]M
     c. F10 > F2 > F3 > F6 > F5 > F13 > F8 > F8 > F9 > F1 > F11 > F2 > F13
     d. [F1 > F2 > F3 > F4 > F5 > F6]M > [F6 > F7 > F8 > F9 > F10 > F11 > F12 > F13]M

(47c) is simply a random assortment of the original set of features, of course inadmissible. But cases like (47d), described with the term multiple exponence, seem to occur rather frequently. An example that will be discussed more thoroughly in section 8.3 is this one from Georgian:

(48) g-xedav-s
     2sg.obj-see-3sg.subj

The problem is that both the prefix and the suffix are sensitive to both subject and object agreement features, and hence in some sense realize them; but then, no matter what the feature hierarchy is, there is no way to segment it into morphemes by Reassociate. Suppose that the feature hierarchy here is (49a) (where S# stands for subject number agreement, Sp for subject person agreement, etc.). Then an acceptable segmentation would be (49b), but what is found, apparently, is (49c).

(49) a. S# > Sp > O# > Op > V
     b. [[O# > Op] > V] < [S# > Sp]
     c. [[(S# > Sp >) O# > Op] > V] < [S# > Sp (> O# > Op)]

However, the problem would disappear if we were to make "silent" some of the features in the two morphemes (here, as in chapter 8, parentheses indicate silent features).

(50) [S# > Sp (> O# > Op)]prefix > [(S# > Sp >) O# > Op]suffix

Now the forms are combinatorially valid. We have drawn a distinction between what a morpheme is paradigmatically sensitive to, and what it "expresses" insofar as the rules for combining forms are concerned. So
both prefix and suffix can be sensitive to the value of some feature Fi, but only one of them will "express" it. The theory makes the interesting prediction that multiple exponence will always involve features that are adjacent on the functional hierarchy. See chapter 8 for further discussion.

Multiple exponence is found in phrasal syntax as well. If we accept the account of head movement–type phenomena that I have suggested, then my proposed theory of multiple exponence can be imported here, making the same very specific predictions about the character of multiple exponence in phrasal syntax. As an example, consider the complementizer agreement phenomena found in certain dialects of Dutch (Zwart 1997).

(51) datte wy speult
     that.pl we play.pl
     (East Netherlandic; from Zwart 1997)

As Zwart notes, in some dialects the morphology on the complementizer differs from the morphology on the verb, while in others the two are identical; in East Netherlandic they are different. What the agreeing dialects have in common is that the complementizer always agrees with the subject, and it always agrees in addition to (not instead of) the verb. Let us suppose that the functional hierarchy is C > T > SA > V (SA = subject agreement). Then we can understand the East Netherlandic example in the following way: (52)
     datte (pl)          speult
     [C > (T > SA)]      [T > SA > . . . > V]
     (the two labels jointly spanning the hierarchy C > T > SA > . . . > V)

That is, the T and SA features are silent on the complementizer (they could as easily have been silent on the verb; see chapter 8 for a discussion of underdetermination of such analyses). From the point of view of syntax, speult will "be a" T > SA > V, and datte will "be a" C and so can combine with the tensed V in accord with the functional hierarchy.

7.4.3 The Distribution of Dual Forms

In sum: if X-bar theory is formulated to account for C-P duality, through an extension of Reassociate to the domain of phrasal syntax, then it will
also account for various cases of absorption, head-to-head movement, gapping, and so on. Is it a notational variant of head-to-head movement? Putting aside the dismissive tone of the phrase, we can answer, yes, in some respects. It would be quite surprising if it were not, because the theory of head-to-head movement now answers to numerous well-documented findings. The theories converge in many ways—for example, with respect to locality and traces. Nonetheless, they are quite different in character, the one consisting entirely of the laws of X-bar theory, and the other also including movement and the theory of locality that movement requires.

I remarked at the outset that C-P duality is a possibility not always realized, in the sense that the two forms it relates do not always both exist, or if they do, are not always equipotent. But why not? Why do we not find a given structure existing alongside all its fully grammatical dual structures? There may be no single answer to this question. In some cases the absence of requisite lexical items is the cause; for example, the T in (35) ([T [NP VP]]) cannot be realized because T does not correspond to any lexical item. Blocking is the answer in other cases. Consider the à le ⇒ au rule in French. C-P duality gives two structures:

(53) à [le N] ⇔ au [0 N]
     to the N

The fact that there is a special lexical item (au) for the right-hand side may be enough to make the left-hand side ungrammatical, through blocking, since the left-hand side is what would be expected on more general grounds, if the item au did not exist. In still other cases both sides of the duality may be permitted to exist if there is some difference in meaning. For example, a dative preposition and a dative Case marking might exist side by side, so long as they differed in meaning.
The English SAI ~ declarative duality is clearly another example: the semantic difference is the difference between interrogative and declarative, or, more accurately, between a range of interpretations that includes interrogative (also exclamative, conditional, and imperative) and a range of interpretations that includes declarative.

But other discrepancies are unaccounted for. For example, in English only the auxiliary verbs participate in SAI. The SAI verbs must be subcategorized to take the "absolutive" NP VP sequence as their complement, as well as
having their usual VP subcategorization. Only auxiliary verbs as a class have this possibility in English (whereas in German, for example, any tensed verb can participate). In English there are telltale discrepancies suggesting that double subcategorization is in fact the correct way to characterize the situation.

(54) a. *Amn't I going?
     b. Aren't I going?
     c. *I aren't going?

Amn't is not a possible IP verb, and aren't is an IP verb with agreement possibilities different from those of its VP counterpart.
Chapter 8
Inflectional Morphology

8.1 The Mirror Principle
The Mirror Principle is the name of an effect, in that it is derived in theories, not fundamental. Specifically, the effect is that the order of morphemes on inflected verbs seems to reflect the structure of the syntactic categories that dominate that verb. For example, in a language in which the verb is marked for both object agreement and subject agreement, subject agreement marking is generally "outside of" (i.e., farther from the stem than) object agreement marking; this ordering mirrors the ordering of subject and object in the clause, where subject is outside of object. To give another example, admirably detailed by Cinque (1998), the expression of various kinds of modality by means of affixes on the verb mirrors the expression of those same kinds of modality when that expression is achieved by means of adverbs or auxiliary verbs. For example, in English, ability is expressed by the modal can, whereas in the Una language of New Guinea, ability is expressed by a verbal suffix -ti. Through painstaking language comparisons, Cinque shows that if an auxiliary verb in one language and an affix in another language represent the same functional element, then they will occur in the same spot on the functional hierarchy.

So long as functional embedding occurs between levels, we have already derived the Mirror Principle in RT. As shown in chapter 7, it arises from the interaction of Shape Conservation with functional embedding. Recall that there are two kinds of embedding, complement embedding and functional embedding. A lexical item is concatenated with its complement, whereas a feature is added to the top of the label of its complement; in both cases the ">" relation is instantiated. Suppose that f is a feature borne by morpheme a.
(1) a. Complement embedding: [a > B]fP
    b. Functional embedding: [ . . . ][f>B]

A familiar example in English is pairs related by do support.

(2) a. [didT leave][T>V]P
    b. [left][T>V]P

The following is the example of functional embedding discussed in chapter 7, on the assumption that T marking occurs at SS:

(3) a. CS: [amalgamateV . . . ]VP, with amalgamateV the head of VP
    b. SS: [amalgamate[T>V] . . . ][T>VP], with amalgamate[T>V] the head of [T > VP]
       (amalgamateV ↦ amalgamate[T>V])
The addition of T to the VP in SS (giving [T > VP]) is the act of embedding under T that is privileged to occur in SS; the "percolation" of T to the head of VP is necessary to conserve the head-of relation that existed in CS. What happens with one feature happens with any number of features. English is not a good language for illustration; still, supposing that AgrS > T in the functional hierarchy, then (3) would subsequently undergo further functional embedding.

(4) [amalgamate[T>V] . . . ][T>VP] ⇒ [amalgamate[AgrS>T>V] . . . ][AgrS>T>VP]

In this way, Shape Conservation will derive a mirroring of the syntactic structure of a phrase on the label of the phrase itself.

Actually, the Mirror Principle also arises in RT in a different way. If X-bar theory requires that labels honor the functional hierarchy, and if that requirement applies equally to labels on phrases and labels on heads, then, assuming that the set of features is the same for each, this requirement will enforce mirror effects. For example, the following case will never arise:

(5) [ . . . H[F1>F2>F3>F4] . . . ][F1>F3>F2>F4]

If there is a single functional hierarchy, then it is impossible for both the head and the label on the phrase to respect it, since they have different orders of elements.

But is the Mirror Principle actually true? There are a number of cases, some of which will be discussed later in this chapter, that seem to argue against it. There are languages, for example, where subject agreement marking stands between the verb and object agreement marking, and
there is no reason to think that the syntax of such languages differs from the syntax of English in the relative ordering of the subject and object NPs. How can we respond to such cases?

We could abandon the principle, or come to view it as a superficial tendency that does not deserve deep codification in the theory of grammar. Mirroring is a norm, but not required. One version of this strategy calls for a separate set of operations whose role is to mediate between the syntactic and the morphological representations—in other words, a set of rules to fix the mistakes that arise when they don't match, so-called readjustment rules. The problem with such rules is that, once they are a part of a theory at all, there is no stopping them. If, for example, one's theory of the syntax-morphology interface includes three (types of) readjustment rules, and one encounters a language whose morphosyntactic interactions lie beyond those rules, it doesn't hurt much to add a fourth one. Adding a readjustment rule component to a theory elasticizes it in such a way that it can respond flexibly to new data. While that might be a good property of some engineering projects, it is a bad property of a theory.

Cinque (1998) offers the most rigid (i.e., the best) theory of morphosyntax. In this theory all inflectional affixation arises from head-to-head movement; as a result, if afi bears feature fi, then the following structure arises and a perfect mirroring results.

(6) [[[[[[V + af1] + af2] + af3]3P t]2P t]1P t]VP

Cinque's theory is quite rigid in its predictions, and clearly false, as Cinque himself recognizes. How can it be fixed? One possibility would be to introduce a readjustment component. Cinque steers clear of this blank check and instead suggests (but does not work out) a theory of null auxiliary verbs and applies it to some obviously troublesome cases.
I think Cinque's instinct is correct—not to write a blank check, but to develop a substantive theory—but I would like to suggest a different solution to the problem of apparent instances of nonmirroring. I think the best place to start is with the recognition that syntax and morphology (i.e., word formation, including inflected-word formation) have different syntaxes; there are universal differences (syntax includes XPs, morphology does not), and there are language-particular differences (English words are head final, English phrases are not; in other languages, such as Japanese, words and phrases are both head final). But
they have one thing in common: they are both productive systems for, among other things, representing the functional hierarchy. Crucially, they represent the same functional hierarchy, but because they are different systems, they do so differently.

It is of course an empirical fact, or claimed fact, that words and phrases are different, in this way or in some other way. It is in fact an empirical imposition that there are words—combinations of morphemes that are smaller than phrases—at all. On minimalist assumptions it is not clear that nonmonomorphemic words should even exist; certainly one can imagine developing a mapping between a sound representation and a meaning representation that does not have anything corresponding to morphologically complex words. Artificial logical languages do not have morphology, for example, nor do programming languages (actually, programming languages do have morphology, but as a practice among programmers, not as a part of official language specification). But given that there are words, and that words cannot be theoretically dissolved into phrases, leaving only morphemes or features, the questions remain, how do words "work" internally, and how do they interface with syntactic representations?

In chapter 7 I gave some idea about how phrasal syntax represents the functional hierarchy and how the head of a phrase relates to the head of a word; and I have located the Mirror Principle in that representation and that relation. But now, how do words work themselves? That is, how do words represent the functional hierarchy? The hope is of course that the correct theory of how words work will eliminate the need for any devices that mediate between syntax and word structure; then we will have eliminated any need for the Mirror Principle itself, as it will simply be a property of the architecture of the theory.
It will not be the case that morphology mirrors syntax, or vice versa; rather, they will both mirror or "represent" the same functional hierarchy, but in different ways.

I have declared that RT inflectional morphology is lexicalist. But I am sure I have overstated the matter—I am sure it would be possible to make a Checking Theory account of verbal morphology consonant with the rest of RT. As usual, the things we call theories are really much looser than we think. But I do think that a lexicalist morphology is the best kind for RT. For one thing, it makes maximal use of the representation relation to account for inflectional morphology, and for the Mirror Principle in particular. For another, I think it would indeed be peculiar to have
eliminated NP movement from scrambling, and from the association of theta roles with Case, but to still have a Checking Theory for verbs.

In what follows I will develop an account of the sublanguage of inflectional morphology as an independent language. I will treat inflected verbal elements and VPs as different languages, both representing the same set of abstract functional elements, in accordance with the conclusion of chapter 7. I will propose a model for the sublanguage of verbal inflection, a formal language that I think is an accurate model of inflectional morphology, and I will present some statistical evaluation of its accuracy. I will also give some idea of how the model can be applied to other aspects of linguistic structure—in particular, how it models Germanic and Hungarian verb raising and verb projection raising. In general, it seems to be a promising model wherever "inheritable" lexical specifications play themselves out combinatorially without extrinsic inhibition.

8.2 The Language CAT
Let us assume that there is a universal set of elements, as in (7), and that these elements are in a fixed hierarchical relation to one another, as indicated.

(7) Universal elements and hierarchy
    AgrS > T > (Asp >) AgrO > V
    (or T > AgrS > (Asp >) AgrO > V)

The question I want to explore is how such elements can be realized by lexical items. The elements are not themselves lexical items, but they are realized by lexical items. For example, -ed in English realizes T (and maybe other features at the same time), and plead realizes V. To express the fact that one morpheme is above another in the hierarchy in (7), we will endow each element in (7) with a "subcategorization" of the form in (8), adopting the convention that if a morpheme expresses an element, it inherits its subcategorization.

(8) a. T: AgrO
    b. -ed: T, AgrO

T takes an AgrO complement, and -ed, because it realizes T, takes an AgrO complement. Given a set of lexical elements each of which expresses one of the elements in (7), we can derive a linear string that contains them
all, if we adopt the X-bar convention that when an element is combined with another element that it is subcategorized for, the result is of the same category as the subcategorizing element (the principle of X-Bar Head Projection).

(9) [morpheme1 [morpheme2 [morpheme3 [morpheme4 [morpheme5]V ]AgrO ]Asp ]T ]AgrS
    (morpheme1 realizes AgrS, morpheme2 T, morpheme3 Asp, morpheme4 AgrO, and morpheme5 V)

Such an account predicts that the surface order of the morphemes will mirror the underlying relation of the elements in (7) to one another. However, in general we find that the surface order of the morphemes of inflected verbs differs from the order in (7). One way to accommodate these different orders is to generate (9) directly and then apply rules that "adjust" it into another structure. As already mentioned, while this approach is obviously not incoherent and so is conceivably correct, it should nevertheless be suppressed because of the following considerations: first, readjustment rules are an inorganic addition to the theory, and second, their presence undercuts any specific expectations about the surface order of morphemes. While (9) is not the only order realizing (7), it is quite obvious that the possible orders realizing (7) are sharply limited.

I have posed the problem of readjustment rules as a problem for derivation of inflected verbs in the lexicon, but it applies equally to models that derive the inflected verb in syntax (as in Cinque 1998) or in "both" (as in Chomsky 1995—some sort of derivation in the lexicon, and feature-by-feature checking in syntax under a "mirror" regime). In Cinque's model, for example, the verb moves successively through a series of functional projections that define clause structure, picking up one affix in each projection under adjunction; this predicts a right-linear string of morphemes mirroring the underlying order of the functional elements, and any deviation must be handled by a different mechanism.
I think a better approach is to somewhat enlarge the combinatorial possibilities among the elements in (7) in the first place. The sole convention that governs combinations thus far is X-Bar Head Projection. Suppose we add to this the convention that a composed unit can inherit a subcategorization as well as a type; this subcategorization is inherited from the nonhead (whereas the type is inherited from the head). Combining the two conventions gives the following Rule of Combination (RC):
(10) a. Rule of Combination
        X_Y + Y_Z → [X + Y]X_Z
     b. X/Y + Y/Z = X/Z

(10a) is the basic rule of Categorial Grammar (given in that theory's notation in (10b)), and for this reason I will call the language generated by RC from a set of elements CAT. To illustrate, suppose for simplicity that T takes V as its complement, and some V takes N(P) as its complement; then, we derive the tensed transitive verb to the right of the arrow in (11) by applying RC to the two elements to the left of the arrow.

(11) V_NP + -ed(T)_V → Ved(T)_NP
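The Rule of Combination can be rendered as a small executable sketch. Here a lexical element is a triple of form, type, and subcategorized type; combination takes its type from the head and its subcategorization from the nonhead. (The encoding and the sample forms -t and -asp are mine, for illustration only; linear order is deliberately left out, in keeping with the relaxation of order discussed below.)

```python
def combine(head, comp):
    """RC (10a): an X subcategorized for Y combines with a Y subcategorized
    for Z, yielding an X subcategorized for Z."""
    forms_h, type_h, subcat_h = head
    forms_c, type_c, subcat_c = comp
    if subcat_h != type_c:
        raise ValueError("head does not subcategorize for the complement's type")
    return (forms_h + forms_c, type_h, subcat_c)

# (11): -ed (type T, takes V) + plead (type V, takes N) -> a tensed transitive verb
ed = (("-ed",), "T", "V")
plead = (("plead",), "V", "N")
assert combine(ed, plead) == (("-ed", "plead"), "T", "N")

# A T-morpheme and an Asp-morpheme likewise form an intermediate unit of
# type T with subcategorization AgrO (anticipating (12) below).
t_m = (("-t",), "T", "Asp")
asp_m = (("-asp",), "Asp", "AgrO")
assert combine(t_m, asp_m) == (("-t", "-asp"), "T", "AgrO")

# Fusion via a phonologically empty realization (anticipating (14) below):
# an empty X-morpheme plus a Y-morpheme leaves one overt form of type X.
empty_x = ((), "X", "Y")
m_y = (("m",), "Y", "Z")
assert combine(empty_x, m_y) == (("m",), "X", "Z")
```

The type check in `combine` enforces exactly the adjacency that underlies the fusion prediction: only elements adjacent in the hierarchy can ever be combined.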
RC will also derive such objects as tensed Asp, as in (12).

(12) T-morpheme + Asp-morpheme → T-Asp
     T_Asp + Asp_AgrO → T_AgrO

Such a rule does not by itself allow for the generation of alternative morpheme orders. However, it does allow for more diversity in the structures that can instantiate (7), permitting (in addition to the purely right-linear structure) such structures as the following, where T and Asp have combined to form an intermediate unit of type T, with subcategorization AgrO:

(13) [AgrS [[T Asp]T, AgrO [AgrO V]AgrO ]]
RC accounts straightforwardly for morphological fusion, the situation in which one morpheme instantiates more than one feature. If some features are permitted to have phonologically empty realizations, derivations like the following will be possible:

(14) [e]X_Y + [morpheme]Y_Z → [morpheme]X_Z
This account of fusion predicts that fused elements must be adjacent in the hierarchy in (7), since RC will only combine adjacent elements. The prediction is overwhelmingly true.

This model still does not generate alternative orders of morphemes. To generate different orders, we will relax the interpretation of the subcategorization notation. The traditional notion of subcategorization bundles together three different kinds of information: type, order, and level.

(15) Subcategorization
     a. Type (N vs. V, etc.)
     b. Order (left vs. right)
     c. Level (root vs. stem; X0 vs. Xn)
So the notation _NP encodes the idea that the verb takes a nominal object (N), that it takes it to the right, and that it takes a phrase-level complement (NP as opposed to N). I want to investigate the properties of the language that results from relaxing the order and level restrictions, retaining only type subcategorization. Relaxing the order restriction means that if V takes N(P) as a complement, then it can take it either to the right or to the left: [V N] or [N V]. To eliminate ambiguity about which element takes which as complement in a structure, I will use the sign ">" introduced in chapter 7 to indicate the relation of head to complement, with the narrow end pointing to the complement. For example, if V takes an N complement, then both of the following constructions are licensed when the order restriction is dropped:

(16) a. [N < V]
     b. [V > N]

I will now define CAT to be the language that is generated by a set of elements in head-complement order under the RC, where subcategorization specifies type only, leaving level and order free.

(17) CAT = {A(_B), B(_C), C(_D) . . . + RC}
     CAT uses type subcategorization only

Put differently, CAT is the set of permutations that arise from suspending order and level subcategorization. I will now determine some properties of CAT with an eye to evaluating its role as a model of some linguistic systems, inflectional morphology among them.

The first thing to establish is the relation of CAT, where order and level are relaxed, to the language that results when they are enforced. When the elements in (17) are combined in such a way that order is fixed, subcategorization is not inherited, and only head type is projected, they determine a single structure, which I will call the right-linear string (RLS).

(18) Right-linear string
     [A > [B > [C > [D > [E]E ]D ]C ]B ]A
The RLS bears a particular relation to CAT that can be explicated by defining two CAT-preserving operations, Flip and Reassociate.
Inflectional Morphology
207
(19) a. Flip
If X = [A > B], A and B terminal or nonterminal, Flip(X) = [B < A].
b. Reassociate
If X = [A > [B > C]], R(X) = [[A > B] > C].

Flip is CAT preserving in the sense that if [A > B] belongs to CAT, then it is guaranteed that [B < A] belongs to CAT, by virtue of CAT's indifference to order. To show that Reassociate is CAT preserving, we reason from the RC in this way: in X = [A > [B > C]], [B > C] is of type B, with subcategorization the same as C's; so A must have subcategorization B if X belongs to CAT; but then A must be directly combinable with B, and the result of that combination will have subcategorization C; so, given the RC, [[A > B] > C] must also belong to CAT. So both operations are CAT preserving. Furthermore, both have obvious inverses, and the inverses are also CAT preserving.

We can now show that CAT is the language that can be generated from the RLS by Flip and Reassociate. We do this by showing that any member X of CAT can be mapped onto the RLS by some combination of Flip and Reassociate, and since these are invertible and CAT preserving, that mapping can be viewed backward as a generation of X from the RLS by some combination of Flip and Reassociate. Suppose there is a structure X that is a member of CAT but cannot be mapped onto the RLS by Flip or Reassociate, or their inverses. Then there must be some node in X that is either a left-branching structure ([[A > B] > C]) or a structure of the form [A < B], for if there are only right-branching structures and rightward-pointing carets, then the structure is the RLS. In the first case, if right association cannot convert X to [A > [B > C]], then by reasoning already given it cannot belong to CAT in the first place, and likewise for the second case; hence, there can be no such structure. So,

(20) CAT = RLS+

where by RLS+ I mean the language generated from the RLS by Flip and Reassociate. The properties of CAT just identified are useful in discussing CAT as a model of linguistic systems.
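The two operations can be written out concretely. In this hypothetical encoding (mine, not the text's), a tree is either a terminal string or a triple (caret, left, right), where '>' marks the left member as head and '<' the right:

```python
# Trees: a terminal is a str; a nonterminal is (caret, left, right),
# where caret '>' means the left member is the head, '<' the right.

def flip(t):
    """Flip of (19a): [A > B] -> [B < A] (and vice versa)."""
    caret, left, right = t
    return ('<', right, left) if caret == '>' else ('>', right, left)

def reassociate(t):
    """Reassociate of (19b): [A > [B > C]] -> [[A > B] > C]."""
    caret, a, bc = t
    assert caret == '>' and isinstance(bc, tuple) and bc[0] == '>'
    _, b, c = bc
    return ('>', ('>', a, b), c)

def terminals(t):
    """Left-to-right terminal string of a tree."""
    if isinstance(t, str):
        return [t]
    _, left, right = t
    return terminals(left) + terminals(right)

x = ('>', 'A', ('>', 'B', 'C'))   # [A > [B > C]]
print(terminals(reassociate(x)))  # Reassociate keeps the terminal order: ['A', 'B', 'C']
print(terminals(flip(x)))         # Flip permutes it: ['B', 'C', 'A']
```

As the demonstration shows, Reassociate changes constituency but not terminal order; only Flip rearranges the terminals.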
By virtue of Flip and Reassociate, CAT can be taken as a model of systems that appear to involve movement. In fact, CAT, via its RLSþ interpretation, mimics movement of constituents of
arbitrary size, over arbitrary distances. To see this, consider the RLS in (21a), and whether H in that structure could be moved to the position between B and C.

(21) Flip and Reassociate can effect long-distance moves, of any node to any higher position.
a. [A > [B > [C > [D > [E > [F > [G > [H > [I > J]]]]]]]]]
Derivation:
b. [A > [B > [C > [D > [E > [F > [G > [H > [I > J]]]]]]]]]
Reassociate ⇓
c. [A > [[[[[[B > C] > D] > E] > F] > G] > H] > [I > J]]
Flip ⇓
d. [A > [H < [[[[[B > C] > D] > E] > F] > G]] > [I > J]]

In the derivation (21b-d), first several applications of Left-Reassociate gather all of the material intervening between the moving item and the landing site, and then a single Flip effects the movement. It is important to understand that as far as CAT is concerned, there is no movement; rather, there is a theorem that if (21b) belongs to CAT, then so does (21d); Flip and Reassociate are simply a way of thinking about this via the RLS+ interpretation of CAT. Nevertheless, these conclusions invite us to consider CAT as a model of linguistic structures that appear to involve movement.

While a single unbounded movement is allowed, multiple movements are quite constrained. The Flip operation in (21c) reverses the caret, thus blocking any further applications of Reassociate. Hence, any further movement in the vicinity of the movement path will be blocked; in particular, there will be

(22) a. no movement of the moved constituent
b. no movement out of the moved constituent (where it is complex)
c. no movement out of extracted-from constituents

It is again important to realize that these are not constraints that need to be imposed on Flip and Reassociate; they all reduce to theorems about CAT. A question I have not been able to answer is, is any system of transformations of the RLS constrained by (22) equivalent to CAT or RLS+?

Because of the restrictions in (22), CAT cannot be used to model wh movement, as wh movement does not conform to any of them. CAT thus
differs from full-blown Categorial Grammar. In particular, it does not have ‘‘type lifting,’’ which can be used to evade (22).

I will now try to assess how big CAT is. If the set of base elements is finite, as it is in the cases we intend to model, CAT itself is finite. As I characterized it earlier, for some fixed chain of elements in the complement-taking relation, CAT defines some set of permutations of those elements. The full set of permutations of n elements (call it P) has n! elements (n · (n − 1) · (n − 2) · · · 2). As n grows, CAT becomes a tiny subset of P; for this reason, any system of a certain size that resembles CAT most likely is CAT. For three elements, CAT is actually identical to P, but for any larger n it is not.

(23) Suppose 1 > 2 > 3 > 4 > 5. Then:
3: 1 2 3        1 [3 2]         [2 < 1] > 3    [2 3] 1    3 < [1 > 2]    3 < [2 > 1]
4: 1 2 [3 > 4]  1 2 [4 < 3]     3 [1 2] 4      1 > [[3 < 2] > 4]    *3 1 4 2    *2 4 1 3
5: *3 1 5 2 4, etc.
The starred strings are the non-CAT strings for n = 3, 4, and 5. To see that they are non-CAT, we can try to build a parse tree for them from the bottom up; for the examples given, there is no way to start building the tree, because no adjacent elements are combinable in either direction (this does not, however, characterize all failures of strings to be members of CAT). The non-CAT strings given here are derivable from the RLS by movement free of the constraints in (22). For example, (24) gives the derivation of ‘‘*3 1 5 2 4.’’

(24) (1 2 3 4 5) → 1 5 2 3 (4 t) → 3 1 5 2 t (4 t) = 3 1 5 2 4

In the first step 5 is extracted from ‘‘2 3 4 5’’; in the second step 3 is extracted from that as well, violating the prohibition against extraction from extracted-from constituents.

In what follows I will try to give some idea of how fast CAT grows relative to P. The table in (25) shows how many elements of P are excluded from CAT for n = 3 . . . 9, and the percentage excluded. Evidently, CAT becomes a vanishing portion of P.
(25)
n    Total : Excluded-from-CAT    % excluded
3    6 : 0                        0
4    24 : 2                       8.3
5    120 : 30                     25.0
6    720 : 326                    45.3
7    5,040 : 3,234                64.1
8    40,320 : 31,762              78.8
9    362,880 : 321,244            88.5
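The membership facts and the small-n counts in (25) can be checked mechanically. The recursive test below is my own reconstruction from the RC reasoning in the text: a CAT constituent always spans a contiguous stretch i..j of the chain 1 > 2 > . . . > n, and is either a single element or the concatenation, in either order (Flip), of two constituents spanning i..m and m+1..j. A brute-force sketch (function names are mine):

```python
from functools import lru_cache
from itertools import permutations
from math import factorial

@lru_cache(maxsize=None)
def is_cat(span):
    """True iff this string of chain elements is a possible CAT constituent:
    it must cover a contiguous stretch of the chain, and split into two
    adjacent sub-constituents, in either order."""
    if len(span) == 1:
        return True
    if max(span) - min(span) != len(span) - 1:
        return False  # does not cover a contiguous stretch of the chain
    return any(is_cat(span[:k]) and is_cat(span[k:])
               for k in range(1, len(span)))

# The starred strings of (23)-(24) are correctly excluded:
assert not is_cat((3, 1, 4, 2))
assert not is_cat((2, 4, 1, 3))
assert not is_cat((3, 1, 5, 2, 4))

# Reproducing the first rows of (25):
for n in range(3, 7):
    total = factorial(n)
    excluded = sum(1 for p in permutations(range(1, n + 1)) if not is_cat(p))
    print(n, total, excluded, round(100 * excluded / total, 1))
# prints: 3 6 0 0.0 / 4 24 2 8.3 / 5 120 30 25.0 / 6 720 326 45.3
```

Run over n = 7 and 8, the same computation yields the 3,234 and 31,762 of (25).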
I have not been able to devise a formula that will give the number of CAT elements for n elements, so the figures in (25) were calculated by hand. There is a formula that puts an upper bound on CAT and is still smaller than P; the table in (26) compares the value of this formula with P. (FR = Flip-Reassociate upper bound)

(26) FR(n) = 2^(2n−3)

n     Ratio of n! to FR(n)
2     1.00e+000
3     7.50e−001
4     7.50e−001
5     9.38e−001
6     1.41e+000
7     2.46e+000
8     4.92e+000
9     1.11e+001
10    2.77e+001
11    7.61e+001
12    2.28e+002
13    7.42e+002
14    2.60e+003
15    9.74e+003
16    3.90e+004
17    1.66e+005
18    7.45e+005
19    3.54e+006
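The ratio column of (26) can be recomputed directly from the bound (a sketch; the function name is mine):

```python
from math import factorial

def fr(n):
    """The Flip-Reassociate upper bound of (26): 2**(2n - 3)."""
    return 2 ** (2 * n - 3)

for n in range(2, 8):
    print(n, f"{factorial(n) / fr(n):.2e}")
# prints: 2 1.00e+00 / 3 7.50e-01 / 4 7.50e-01 / 5 9.38e-01 / 6 1.41e+00 / 7 2.46e+00
```
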
The formula is arrived at by considering each node to be independently flippable, and each pair of adjacent nodes to be independently reassociable; since there are n − 1 of the former and n − 2 of the latter, there are

(27) 2^(n−1) · 2^(n−2) = 2^(2n−3)

ways to transform the RLS to generate CAT. But this overestimates the actual number of permutations: any pair of adjacent unflipped right-associated nodes in a structure X can be left-associated to yield another member of CAT that has the same order of terminal elements, so the same permutation of elements will be counted twice. I have not figured out a way to subtract or to estimate the size of such redundancies.

Clearly, if we were modeling a linguistic system involving 15 concatenating elements, and the observed permutations of these elements were found to conform to what would be expected of a CAT system, we would have resounding confirmation that CAT is a good model of the system, since the chance that exactly these orders would arise in a system not essentially equivalent to CAT would be small. Unfortunately, most linguistic systems do not involve the concatenation of such large numbers of elements; some cases of interest, such as inflectional systems, may involve 4 to 6 elements, and at that level the difference between CAT and P is not astronomical. Conclusions can nevertheless be drawn for systems of this size as well, if a number of different languages are considered. For example, since the chance that 10 languages with 5 morphemes are all CAT = .75^10 ≈ 5%, one could claim significant confirmation of the CAT-like behavior of the subsystem in question from a collection of 10 such languages. With this in mind, in section 8.4 I will survey inflectional systems with 4 and 5 morphemes to assess CAT as a model of inflection.

8.3 Inflectional Systems as an Instantiation of CAT
Suppose we have a fixed universal chain of elements in the complement-of relation, as in (28).

(28) Universal elements and hierarchy
AgrS > T > Asp > AgrO > V, or perhaps
T > AgrS > Asp > AgrO > V
(type subcategorization only)

As before, the caret in X > Y means 'X takes things of type Y as complement', but with no restriction on the linear order of the elements or on the ‘‘level’’ (i.e., bar level, as in X-bar theory) of the elements.
CAT with (28) as its base is clearly not a good model of any particular language's inflectional morphology, as no language has inflectional morphology where, for example, the past tense affix may freely occur either before or after the verb (corresponding to Flip). Any given language will fix the linear order. In addition, any given language will fix the ‘‘level’’ at which items attach, in a way that I will make precise. We might say that CAT models inflectional morphology in the sense that it sets the limits on possible realizations of the universal chain in (28), but that any particular language will impose order and level constraints on the subcategorization of particular items that will yield some subset of CAT. In particular, it would be interesting to explore the possibility that the only way inflectional systems can differ is in terms of these two properties. (29) is an attempt to formulate this hypothesis.

(29) Lexical Variation Hypothesis
Language-particular inflectional systems differ only in
a. order restrictions
b. level restrictions
on the subcategorizations of individual morphemes or classes of morphemes.

The Lexical Variation Hypothesis (LVH) is independent of whether CAT is a good model of inflection in general; it could be that CAT sets accurate bounds on what permutations of elements in general can instantiate the chain in (28), but that the way languages differ within that bound is something other than (29). In what follows I will be evaluating the LVH as well as CAT, but CAT is the main prey.

The order restriction determines the difference between prefix and suffix for morphemes, and the difference between head-initial and head-final order in syntax. The level restrictions have to do with what ‘‘size’’ the complement must be. The details depend on assumptions about what units are available in the first place. Two cases will be of interest here.
One, already mentioned, will be the word/phrase distinction; the subcategorization __N, for example, I will take to be ambiguous between N⁰ and NP. In addition, we will need recourse to levels of morphological structure, the most familiar version of which is the root/stem/word distinction introduced in Selkirk 1982, where stems are composed of roots, but not vice versa, and words are composed of stems, but not vice versa, giving a three-way distinction among levels. So we will allow a language to impose a restriction on an
AgrO morpheme, for example, that it attach to a verb root, and not to any other level of verb, in accordance with the LVH.

I should note that this system will give ambiguous derivations for cases that are not normally ambiguous, and where there is no obvious semantic ambiguity. To take an example from English derivational morphology, if both -ate and -ion are type 1 (root-attaching) suffixes, then both of the following structures will be allowed:

(30) a. [[affect + at] + ion]
b. [affect [at + ion]]

If the subcategorizations and restrictions are satisfied in (30a), then under the RC, they must be satisfied in (30b) as well. The possibility of structure (30b) might be welcome for such cases, as there is some tendency to think of -ation as a single affix in such cases; in the present case, for instance, there is no word *affectate.

Strictly speaking, a further unfamiliar sort of derivation should be possible as well. Typically, the lexicon is divided into roots, stems, root-attaching affixes, and stem-attaching affixes. But, in fact, the system proposed here does not give exactly this classification. Consider the properties of -able and -ity listed in (31a,b).

(31) a. -able, A, Vstem
b. -ity, N, Aroot
c. -[ability]: N, Vstem
d. [compact + [ability]]
The question raised in (31c) is, can -ity attach directly to -able, to derive the complex suffix -ability, with the properties shown in (31c) (derived by the RC)? The question comes down to whether or not -able can satisfy the subcategorization of -ity, and crucial to the answer is whether it satisfies the restriction that -ity attaches only to roots. Now, -able itself attaches to stems, but this leaves open the question whether it is itself a stem or a root or both. If we decide that it can be a root, then there is nothing to block (31c), and so (31d) will be a typical nominalization using these two affixes. If CAT is right, then these ambiguities are harmless; if they can be verified, I would in fact consider them confirmatory, because they would be puzzling without CAT.

For each language I will examine in section 8.4, I will ask two questions. First, is the order of inflected elements a CAT order or not? Second, is there a reasonable specification of order and level restrictions on the morphemes that instantiate the functional elements that will yield the
particular shape of the inflected word in that language? The first question addresses CAT by itself, the second, CAT plus the LVH.

I will begin with the assumption that the chain of elements in (28) is the fixed universal base for CAT; any flexibility introduced into this assumption would not be necessarily incompatible with CAT, but it would weaken empirical expectations. If, for example, T and AgrS were ordered differently in different languages, we would simply have different bases for CAT in those different languages.

In order to have a verbal morphology, a language needs a set of morpheme classes that span the functional chain. Recall that a morpheme can span subchains of the functional chain through fusion, which arises when one of the morphemes that the RC combines is a null morpheme. In general, the fusions that occur in a language are systematic; for example, in English AgrS and T always fuse. Such generalizations are part of the lexical style of the language; but, while fascinating in their own right and essentially not understood, they are not directly the subject at hand. In (32) the set {m1, m2, m3} is a spanning vocabulary for F1 . . . F6.

(32) [diagram: m1, m2, m3 spanning the chain F1 . . . F6]
If the RC generates m1, m2, and m3, then it is guaranteed that m2 can combine with m3, and the result of that combination can combine with m1, and so [m1 [m2 m3]] will span the functional structure. The spanning vocabulary might consist of affixes, in which case single inflected words will span the functional structure; or it might consist of words, in which case syntactic constructions will span the functional structure (giving rise to what are called auxiliary verb systems); or it might consist of some combination of the two. In English, for example, the spanning vocabulary consists of both words and roots and affixes.

(33) T > AgrS > Asp > AgrO > V
     |---was---|  |------seeing------|

Was is a word that spans T and AgrS; seeing is a (derived) word that spans Asp, AgrO, and V (under the assumption that AgrO is universally a part of the chain). Was and seeingP can be combined in syntax, since was is a T, AgrO element and seeingP is a projection of the AgrO element seeing.
(34) a. In lexicon
ing: AgrO + Asp, V
see: V, NP
see + ing → seeing: Asp, NP
b. In syntax
was: T, AspP
seeing: AspP, NP
seeing: AspP, NP + NP → [seeing NP]AspP
[was] + [seeing]AspP → [was [seeing NP]]TP

Importantly, it is the RC that is responsible for the operations in both syntax and morphology. The only difference is that in morphology it combines X⁰-level objects, whereas in syntax it combines X⁰- and XP-level objects; but this is the characteristic difference between syntax and morphology in any event. From that difference arises the further difference that inheritance of subcategorization largely has no effect in phrasal syntax, since XPs have no subcategorization. It is possible that there are phrasal syntax junctures of units smaller than XP, in which case inheritance should be detectable again; I will suggest in section 8.5 that this is the correct view of the syntax of verb-raising constructions.

An obvious difficulty for the notion of spanning vocabulary, as it arises from the RC, is the existence of multiple exponence. Multiple exponence (the expression of a single functional element on more than one morpheme in an inflected verb) should be impossible given the RC. This is because if a feature is in two morphemes, there is no way those morphemes can be combined by the RC: the subcategorization of one can never match the type of the other, nor can they be hooked together by any intermediate morphemes, for essentially the same reason.

Thus far I have assumed that the functional elements that a morpheme ‘‘realizes’’ will be exactly the set of elements that the shape of the morpheme is sensitive to. This is a very natural assumption; for example, the fact that the appearance, or not, of the -s marker on English verbs is sensitive to the functional elements of Tense and Person and Number leads us to suppose that -s ‘‘represents’’ these features. To account for the possibility of multiple exponence, we must pull apart somewhat these two properties of morphemes.
We must allow a morpheme to be ‘‘sensitive to’’ more features than it realizes. The result will inevitably be a weaker theory, though there is at least one version that has some teeth to it. Suppose the functional elements that a morpheme represents must be
some continuous subsequence of the full chain of functional elements that it is sensitive to. This would allow a notation in which the functional elements that the morpheme is sensitive to, but that it does not represent, can simply be marked as ‘‘inert’’ for the purposes of the RC. (I will use parentheses to mark such an inert subsequence.)

(35) Multiple exponence
T > AgrS > Asp > AgrO > V
|---------af1 (---------)|
          |-----af2-----|

Suppose that af1 is ‘‘sensitive to’’ the features T through AgrO, whereas af2 is sensitive to Asp through AgrO. Without the notion of inert element the RC could not combine both af1 and af2 with a verb, to derive forms like (36), because both affixes would be subcategorized for AgrO, but neither would be AgrO.

(36) V + af1 + af2

But suppose Asp and AgrO are inert for the purposes of the RC, even though they are relevant for the paradigmatic behavior of the class of affixes that af1 belongs to. The resulting representations will be as in (37), and the RC can combine them both with the verb as in (37c), since now the type of af2 matches the subcategorization of af1.

(37) a. af1: T > AgrS > (Asp > AgrO)
b. af2: Asp > AgrO            Asp, V
c. [[V + af2]Asp + af1]       T, Asp
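The combination in (37) can be traced in a small sketch of the RC over typed entries; af1's type and subcategorization below are read off its non-inert T > AgrS stretch (the dict encoding and the helper name are mine):

```python
# (37): af1 is sensitive to T..AgrO, but Asp and AgrO are inert, so for the
# RC it is of type T and selects Asp; af2 realizes Asp > AgrO and selects V.
af1  = {"form": "af1", "type": "T",   "subcat": "Asp"}  # T > AgrS > (Asp > AgrO)
af2  = {"form": "af2", "type": "Asp", "subcat": "V"}    # Asp > AgrO
verb = {"form": "V",   "type": "V",   "subcat": None}

def combine(head, comp):
    """RC: the head's subcategorization must match the complement's type;
    the result keeps the head's type and inherits the complement's subcat."""
    assert head["subcat"] == comp["type"], "RC: subcat/type mismatch"
    return {"form": f"[{comp['form']} + {head['form']}]",
            "type": head["type"], "subcat": comp["subcat"]}

inner = combine(af2, verb)   # [V + af2], type Asp -- matches af1's subcat
outer = combine(af1, inner)  # [[V + af2] + af1], type T -- (37c)
print(outer["form"], outer["type"])  # [[V + af2] + af1] T
```

Without the inert marking, both affixes would be of type and subcategorization AgrO, and the second `combine` would fail.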
The restriction of inert elements to a single subsequence is an empirically sharp prediction, though I am not in a position to present evidence that would confirm or refute it. We could envision an even tighter version, in which the subsequence was always peripheral; again, I have no idea how such a restriction would fare empirically, but it is easy to imagine a kind of language that would refute it.

But one problem with the account as it stands is that it permits arbitrary choices in determining which affix has the inert features, and where features are inert. Imagine two affixes, af1 and af2, each sensitive to the same subchain of three elements.

(38) a. . . . F1 > F2 > F3 . . .
af1: F1 > F2 > F3
af2: F1 > F2 > F3
b. V + af1 + af2
There are six different ways that inertness could be assigned so that af1 and af2 can be combined with a verb as successive morphemes, as in (38b); (39) shows three of them.

(39) a. af1: (F1 > F2 > F3)
    af2: F1 > F2 > F3
b. af1: (F1 > F2) > F3
    af2: F1 > F2 > (F3)
c. af1: (F1) > F2 > F3
    af2: F1 > (F2 > F3)

We can probably rule out (39a) on general grounds: it gives af1 no features for the RC to use in deriving complex verbs, so such an affix would never appear in a derivation that was purely the result of successive applications of the RC. As for the difference between (39b) and (39c), there is an interesting connection between inertness of features and paradigm structure that could be used to give determinate analyses in such cases. Elsewhere (Williams 1997) I have proposed that the inert elements will always be minor paradigm dimensions, and the noninert elements will be major paradigm dimensions. Major dimensions represent the broadest subdivisions in the paradigm, and evidence for major versus minor status comes from studying syncretism in the paradigm. The fact that all English past tense forms fall together (e.g., pleaded is the past form for all persons and numbers) is evidence for the major status of Tense in English, whereas the fact that English 3rd person forms do not fall together (pleads vs. plead) shows that Person is a minor dimension. This connection to paradigm structure could resolve ambiguities in the lexical assignment of inertness. If the analysis in (39b) were correct, for example, we would expect F3 to be more major than af2, but no such expectation arises from analysis (39c). There are in fact languages with exactly the affix pattern illustrated in (39), Arabic and Georgian being among those I have analyzed as just proposed.
In each language there are two morpheme classes (a prefix class and a suffix class), each of which is sensitive to exactly the same set of features ({SubjNumber, SubjPerson, ObjNumber, ObjPerson} in Georgian and {SubjGender, SubjPerson, SubjNumber} in Arabic). (My analyses were based on prior studies by Anderson (1992) and Noyer (1992), respectively.) There is a huge potential for ambiguity in the assignment of inertness for the Georgian case especially, where four features are implicated. (40) lists half of the possibilities.
(40) a. af1: F1 > F2 > F3 > F4
    af2: (F1 > F2 > F3 > F4)
b. af1: (F1) > F2 > F3 > F4
    af2: F1 > (F2 > F3 > F4)
c. af1: (F1 > F2) > F3 > F4
    af2: F1 > F2 > (F3 > F4)
d. af1: F1 > (F2 > F3 > F4)
    af2: (F1) > F2 > F3 > F4
e. af1: (F1 > F2 > F3 > F4)
    af2: F1 > F2 > F3 > F4

In both languages examined it turns out that if Fi is a major dimension for af1, then it is a minor dimension for af2, and vice versa. The Georgian inflected verb, for example, has the form in (41), where both affixes are sensitive to both subject agreement features and object agreement features.

(41) F1 > F2 > F3 > F4
af1 + root + af2

But an examination of syncretisms in the paradigms for the two affixes shows that Subject features are major dimensions for the suffix, and minor for the prefix, and Object features are the opposite, so that the underdetermination is resolved, giving a system something like (40c). It is perhaps surprising that the paradigms associated with morphemes sensitive to identical sets of features should not have the same major/minor dimensional ordering within the same language, but that may be one of the milder surprises in store for us in the much-studied but little-understood human ability to build paradigms. The structure of paradigms is not the subject of this book; but see Williams 1997 for a discussion of the paradigms from Arabic and Georgian that substantiate the claims about paradigms made here. For present purposes it is enough to know that the hypothesized connection to paradigm structure can eliminate the arbitrariness of determining what functional elements are inert in what morphemes and can therefore yield more determinate analyses.

8.4 Some Inflectional Systems

I have already outlined the enterprise of this section. For each language I will first determine whether CAT plus a universal base of functional elements sets the proper bounds on what an inflectional system can do to represent functional elements; and second, see if the details of word shape in particular languages can be predicted by specifying level and order restrictions on particular morphemes or classes of morphemes, in accordance with the LVH.

The simplest sort of language from an inflectional point of view is one where the RLS of functional elements is realized as an RLS of suffixes.

(42) V + af2 + af3 + af4
     V   AgrO  T     AgrS

Such a language is the one expected in particular in Cinque's version of the Pollock-style model, in which the verb moves in syntax through the head position of a series of functional projections, one projection for each functional element, picking up an affix in each move by left-adjoining to it. In the terms I will use to describe inflectional systems here, it is a language that exhibits no fusion, and in which each morpheme takes its complement to the left. If we use a kind of level restriction to bar affixation to other affixes, then exactly the left-linear structure will result.

(43) [[[V + af1] + af2] + af3]

While languages do exist that so transparently represent the functional chain, they are somewhat rare. More complex is a language with some fusion, but with the transparently mirroring order of markers. Consider for example Mohawk or Southern Tiwa, whose verbal inflectional systems look like this:

(44) a. Ka-'u'u-wia-ban.
    1subj.2obj-baby-give-past
    (Southern Tiwa)
b. [AgrS > AgrO > V] < T
    [AgrS = AgrO > V] < T
c. T: suffix, T, AgrS
    AgrS = AgrO: prefix arising from fusion: AgrS, V
    V: stem

Example (44a) shows that subject and object agreement marking are fused into one morpheme, ka. (‘‘=’’ represents the boundary at which two adjacent elements are fused.) Mohawk and Southern Tiwa have the further complication that T is on the opposite side of the stem, but as the parse in (44b) suggests, this is not a problem for the hierarchical relation
among the elements. The match between functional elements and morphemes is one-to-one except for the single fusion. Note that Mohawk and Southern Tiwa require the complement order AgrS < T so that AgrO and AgrS will be adjacent, hence able to fuse; it remains to be seen if that order is universally possible. (44c) shows the language-particular specifications that determine the shape of the inflected verb.

Swahili, which does not exhibit fusion, would at first glance seem to provide an even more transparent representation of the functional elements and thus an exact match between morphemes and functional elements. But findings reported by Barrett-Keach (1986) show that the Swahili inflected verb does not instantiate the RLS. Barrett-Keach shows that the inflected verb has an internally bifurcated structure, as illustrated in (45b).

(45) a. [AgrS + T + AgrO + V]word
b. [[AgrS + T] [AgrO + V]]word

Barrett-Keach gives two kinds of evidence for this conclusion. First, the inflected verb has the accent pattern that Swahili assigns to compound terms generally, including nominal compounds: main stress on the penultimate syllable of the second element, and secondary stress on the penultimate syllable of the first element. (SP and OP stand for subject and object pronoun clitic.)

(46) Juma a-li-ki-soma kitabu.
Juma sp-past-op-read book
'Juma read the book.'
(Barrett-Keach 1986, (1a))

This would follow if the structure in (45b) were correct, and the two constituents of the inflected verb were identified as stems.

(47) [[T AgrS]stem [AgrO V]stem]word

Barrett-Keach's second piece of evidence is that Swahili has a suffix, cho, indicating relativization, which can appear in the middle of the inflected verb, exactly between the two hypothesized stems.

(48) kitabu a-li-cho-ki-soma
book sp-past-rel-op-read
'the book which s/he read'
(Barrett-Keach 1986, (10))
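The bifurcated structure in (45b)/(47) can be assembled in a toy model of the level restrictions at work here: two affixation steps each build a stem, and the RC, applying to two stems, compounds them into the word. A hypothetical sketch (category labels stand in for the Swahili forms; the encoding is mine):

```python
# Toy model of (47): affixation builds stems; compounding (the RC over
# two stems) yields the bifurcated word. Encodings are illustrative only.
def affix(pre, stem):
    """A prefix attaches to a stem, yielding a stem."""
    assert pre["level"] == "prefix" and stem["level"] == "stem"
    return {"form": f"[{pre['form']} {stem['form']}]", "level": "stem"}

def compound(s1, s2):
    """The RC applying to two stems yields a word."""
    assert s1["level"] == "stem" and s2["level"] == "stem"
    return {"form": f"[{s1['form']} {s2['form']}]", "level": "word"}

t    = {"form": "T",    "level": "prefix"}
agrs = {"form": "AgrS", "level": "stem"}
agro = {"form": "AgrO", "level": "prefix"}
v    = {"form": "V",    "level": "stem"}

word = compound(affix(t, agrs), affix(agro, v))
print(word["form"])  # [[T AgrS] [AgrO V]] -- the structure in (47)
```
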
Cho is clearly a suffix, because it can also be appended to the complementizer (amba + cho). It can receive a unitary account only if a-li-cho-ki-soma has the internal structure indicated in (45), which allows cho to be appended to the first stem of the inflected verb. We can achieve that structure by stipulating the following language-specific constraints on the morphemes that realize the functional elements:

(49) T: prefix
AgrS: stem
AgrO: prefix
V: stem

T and AgrS compose a stem through affixation, as do AgrO and V; then compounding (actually, the RC applying to two stems) assembles the complete inflected verb from these subunits. Swahili illustrates what might be called a word-internal auxiliary system (the T-AgrS stem), and this treatment of it prefigures my general treatment of auxiliary systems.

I now turn to the more problematic cases for theories that essentially expect the RLS (or LLS) as the only realization of functional elements. The first is Navajo, in which AgrS intervenes between AgrO and V.

(50) AgrO Asp T AgrS V

There are two ways to parse this structure in CAT terms, depending on whether T > AgrS (51a) or AgrS > T (51b). The lexical specifications needed to force the analysis are given below each parse.

(51) a. [AgrO < [Asp < [T > AgrS]]] > V
    T: prefix
    AgrS: stem
    Asp: prefix
    AgrO: prefix
b. [[AgrO < Asp] < T] < [AgrS > V]
    T: suffix
    Asp: suffix
    AgrS: suffix
    AgrO: stem

Mohawk and Swahili both require T > AgrS, so we might want to tentatively assume that as the universal order and therefore favor parse (51a). On behalf of parse (51b) we could point to the uniform suffixation to the AgrO stem that would result; although mixed systems exist with both
prefixes and suffixes, the economics of the lexicon may favor uniform prefixation or suffixation. I leave the question open.

Inuit is the mirror image of Navajo, with AgrS between V and AgrO as a suffix.

(52) a. V T AgrS AgrO
b. Piita-p mattak niri-va-a-∅.
    Piita-erg mattak.abs ate-indic-3sg.subj-3sg.obj
    'Piita ate the mattak.'
    (Bok-Bennema 1995, 105)

(53) V < [T > AgrS > AgrO]
AgrS: prefix
T: prefix
AgrO: stem

(52) shows the order of elements, and (53) shows the parse and the lexical specifications that force the analysis. As with Navajo, a different parse results if AgrS > T.

In Yuman, Lakhota, and Alabama, on the other hand, there is a CAT parse only if T > AgrO.

(54) AgrO AgrS V T
[[AgrO < AgrS] > V] < T
(P. Munro, personal communication)

There is no parse if AgrS > T, as the string then represents the ‘‘3 1 4 2’’ configuration already shown to lie outside CAT. We now have a conflict between the requirements of two different languages: Navajo requires T > AgrS because its fusion of AgrS and AgrO entails that these must be adjacent in the chain; Yuman, on the other hand, requires AgrS > T. An obvious way to resolve this would be to allow languages to differ in their choice on this point, or, equivalently, to claim that there are two distinct notions of Tense, one superior to and the other inferior to AgrS. Convincing support for the latter position would of course be a language in which both occur. I will leave the matter unresolved.

In English, auxiliary verbs are part of the spanning vocabulary. The auxiliary verbs take their complements in syntax rather than morphology; consequently, their complements are XPs rather than Xs. The special feature of the morphology is that only one affix occurs on the verb; AgrS and T are always fused. Temporarily abandoning the fixed universal chain, I will interpolate some Asp elements and a Voice element in the
functional chain. How this new chain is related to the universal chain will be left open.

(55) English functional chain
     AgrS = T > Asp1 > Asp2 > Voi > V

How various elements of the spanning vocabulary are related to the functional chain is shown in (56).

(56)  T       Asp1    Asp2    Voi     V
      might   have    been    being   killed
                                      |killed|
                              |----killing---|
                      |--------killed--------|
              |-------------kill-------------|
      |-----------------kills----------------|
      |---------passive was---------|
      |-----has------|
                      |-----been----|
      |-modal|
      |--s---|
The complex items shown here are derived by the RC from more elementary morphemes; for example, kills, which spans the whole chain, is formed from kill and -s. The basic rule for relating form to function here is the following: the stem of a form is determined by the left edge of its span, and the form of that stem by the right edge. For example, has spans T and Asp1. If it spanned more to the right, a different stem would be used (has vs. was); if it spanned less to the left, a different form would be used (e.g., has vs. have). A given clause will span functional structure by a combination of morphological and syntactic derived units. For example, (57) shows the derivation of the pieces of John was being sued, from bottom to top.

(57) [be < ing]Voi, V                  morphology
     [sue < ed]V, NP                   morphology
     [being > suedP]VoiP               syntax
     John [was > [being > sued]]T      syntax
The RC applies in both the lexicon and the syntax. The only difference in the outcome is determined by independent differences between morphology and syntax: complementation is left-headed in syntax but right-headed
in morphology, and complements are phrases in syntax but X0s in morphology. The RC, along with lexical specifications of the type that the LVH affords, thus lays out a good first approximation to the general question, what is a possible verbal inflectional system in natural language? The fact that the RC is invariant across agglutinating and isolating systems makes it the only real candidate for a general answer. In what follows I will sketch its role in other domains.

8.5 Verb (Projection) Raising as an Instance of CAT
I now turn to an application of CAT outside inflectional morphology: namely, the realm of verb projection raising. In fact, I believe the applications of CAT outside morphology are numerous, and I have picked verb projection raising merely as an illustration. My best guess is that CAT is the relevant model of a system that involves only the playing out of lexical specifications of type, order, and level. The analysis presented here is based on Haegeman and Van Riemsdijk's (1986) discussion of the phenomenon. The model I present below incorporates insights from their work, but rejects the role of movement in the system, deriving all forms directly by the RC and lexical specifications of order and level. Example (58) illustrates verb raising in Dutch.

(58) a. *dat Jan een huis kopen wil     ("DS")   NP < V < V
         that Jan a house buy wants
         'that Jan wants to buy a house'
     b. dat Jan een huis wil kopen      (VR)     NP < [V > V]
     c. *dat Jan wil een huis kopen     (VPR)    V > [NP < V]
     (H&VR 1986, 419)
(58a) is the (ungrammatical) deep structure in Haegeman and Van Riemsdijk’s (H&VR) model; (58b) is the verb-raising (VR) structure. (58c) is the verb projection raising (VPR) structure, which is ungrammatical in Dutch. In the VR construction, an embedded verb is raised out of its complement and adjoined to the matrix verb, to the right; in the VPR construction, the same operation is performed on an embedded VP. While VPR is ungrammatical in Dutch, it is found in some other Germanic dialects, such as West Flemish (59) and Swiss German (60).
(59) a. da Jan een hus kopen wilt                 NP < V < V
     b. da Jan een hus wilt kopen       (VR)      NP < [V > V]
     c. da Jan wilt een hus kopen       (VPR)     V > [NP < V]
     (H&VR 1986, 419)

(60) a. das de Hans es huus chaufe wil            NP < V < V
     b. das de Hans es huus wil chaufe  (VR)      NP < [V > V]
     c. das de Hans wil es huus chaufe  (VPR)     V > [NP < V]
     (H&VR 1986, 419)
I will analyze the VR and VPR constructions as instantiations of CAT. This means that CAT sets the outer bounds on the form that these constructions can take. It also means that all variation will be found in the level and order subcategorizations of predicates or classes of predicates. In the right margin of the constructions listed above are the CAT representations. If we were to interpret CAT as RLS+ (or, more appropriately, LLS+: just like RLS+, but using the LLS instead of the RLS as the base), then we would take (59a) as the LLS and [[NP < V1]VP < V0] as the base structure, and we would apply Flip and Reassociate to derive [NP < [V0 > V1]], which is the West Flemish VR structure. Some mechanism would be needed to guarantee that Reassociate and Flip applied obligatorily in this case. I will instead model V(P)R directly as CAT, in accordance with the LVH. Under this interpretation Flip will correspond to order: right, absence of Flip to order: left, and optional Flip to unspecified order. Left-Reassociate will correspond to level: X0, which gives VR; lack of Left-Reassociate will correspond to level: XP; and optional Left-Reassociate will correspond to unspecified level. It seems to me that the entire range of constructions discussed by H&VR can be described in these terms. Dutch, for example, obligatorily Flips embedded verbs, but never VPs; in CAT terms this means that the verbs in question have the subcategorization shown in (61).

(61) V0

Modal verbs are exceptional in that they undergo Flip optionally.

(62) a. dat ik hem zien wilM
        that I him see want
     b. dat ik hem wilM zien
     (H&VR 1986, 426)
In CAT terms this means that the order parameter is unset for these verbs; or, equivalently, they have the additional subcategorization in (63).

(63) V0

There is an unexpected exception to (63): only basic Vs can have this subcategorization, not V0s that are themselves complex verbs.

(64) a. *dat ik hem kunnen zien wilM
         that I him can see want
     b. dat ik hem wilM kunnen zien
        'that I want to be able to see him'
     (H&VR 1986, 426)

This restriction is intuitively a level constraint: complex [V V] structures are "bigger" than simple Vs. If we use the term stem in such a way that it includes simple Vs, but excludes V-V compound verbs, then we could add the level restriction to (63) to get (65).

(65) V0stem
In all of these cases the derived verb cluster has the same subcategorization as the complement verb in the cluster, as determined by the RC. As a result, hem is the direct object of the complex cluster in (64b), for example, so the CAT structure of that clause is as follows:

(66) dat ik [hem < [wil > [kunnen > zien]V]V]VP

German has obligatory Flip for auxiliary verbs (H&VR 1986, 427) but optional Flip for modals; these are straightforwardly treated as order constraints on the model of Dutch. West Flemish obligatorily Flips either the V or the whole VP around a modal or auxiliary, as in (59). The order and level restrictions that account for this are as follows:

(67) M,A: V

The notation V is to be understood as 'V0 or VP'; that is, no level constraint is applied, and so the term covers both VR and VPR. I now turn to the complexities that arise when a series of VPs is involved in VPR in Swiss German. I will show that the lack of a level constraint in (67) accounts precisely for a complex array of possible outcomes. The possible orders of a series of four verbs in which the lowest takes a direct object are listed in (68).
(68) a. das er [[[en arie singe] chöne] wele] hät
        that he an aria sing can want has
        'that he has wanted to be able to sing an aria'
     b. N < V4 < V3 < V2 < V1
     c. V1 NP V2 V3 V4
     d. V1 V2 NP V3 V4
     e. V1 V2 V3 NP V4
     f. *V1 V2 V3 V4 NP
     (H&VR 1986, 428)
The verbs must all appear in Flipped order; the direct object can appear anywhere in the series except after the most deeply embedded complement. This patterning follows immediately from the stipulation in (67), coupled with the further stipulation that no verb that takes a direct object can take it on the right.

(69) a. M,A: V
     b. V: NP

The absence of a level constraint in (69a) corresponds in RLS+ to optional Reassociate; Flip is obligatory, so the verbs always appear in exactly reverse order (the reverse of (68b)).

(70) a. Reassociate at will.
     b. Flip all V < V nodes. (for complex as well as simple Vs)
     c. Flip no NP < V nodes. (for complex as well as simple Vs)
     d. V2: NP
     e. V1: V
     f. V1 + V2 → [V1 > V2]V: NP

The stipulation in (70b,c) that Flip is forced (or fails) for both complex and simple Vs taking direct objects follows from the RC, hence does not count as a separate stipulation. If a complex verb is formed by combining a modal or auxiliary with a transitive verb, the subcategorization of the transitive verb will be inherited, including any order restriction, as the RC dictates—(70f) is the result of combining (70d) and (70e) with the RC. So the extra stipulation in (70c) is not part of the theory; rather, it is added for clarification. (70a–c) can generate all of the patterns in (68). (68d), for example, is derived by applying Reassociate followed by obligatory Flip.
(71)
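That (70a–c) delimit exactly these possibilities can also be checked mechanically. The sketch below is my own illustration, not part of H&VR's or this chapter's formal apparatus; Python is used only for concreteness. The one assumption, marked in the comments, is that Flip applies obligatorily to any node whose complement is verb-headed and never to a node whose complement is the NP itself.

```python
from itertools import product

# Leaves: the object 'NP' and the verbs 'V1'..'V4' (V1 = the highest verb).
# A tree is a leaf (string) or a pair (complement, head), i.e. [comp < head].

def rebracketings(leaves):
    """Reassociate at will (70a): every binary bracketing of the leaf string."""
    if len(leaves) == 1:
        return [leaves[0]]
    trees = []
    for i in range(1, len(leaves)):
        for left, right in product(rebracketings(leaves[:i]),
                                   rebracketings(leaves[i:])):
            trees.append((left, right))
    return trees

def linearize(tree):
    """Obligatory Flip (70b) of every [V < V] node, complex or simple;
    no Flip of [NP < V] nodes (70c)."""
    if isinstance(tree, str):
        return [tree]
    comp, head = tree
    if comp == 'NP':                        # NP < V: never Flipped
        return linearize(comp) + linearize(head)
    return linearize(head) + linearize(comp)  # V < V: Flipped obligatorily

orders = {' '.join(linearize(t))
          for t in rebracketings(['NP', 'V4', 'V3', 'V2', 'V1'])}
```

The resulting set contains only orders in which the verbs are strictly reversed (V1 V2 V3 V4) with the object preceding them or interpolated anywhere among them, and never after V4, matching the generalization that the object can appear anywhere in the series except after the most deeply embedded complement.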
It is important to remember that Flip and Reassociate are not essential to the analysis; rather, they are just a way to think about CAT. The entire analysis is (69) by itself. A further consequence is that when the embedded verb has two arguments, they may individually appear anywhere among the set of reanalyzed verbs, so long as they do not exchange places; the verbs will be ordered among themselves exactly as in the one-argument case (68).

(72) das er em Karajan1 en arie2 vorsinge3 chöne2 wil1
     that he (to) Karajan an aria sing-for can wants
     'that he wants to be able to sing an aria for Karajan'
     (H&VR 1986, 434)

(73) a. NP1 NP2 V1 V2 V3
     b. NP1 V1 NP2 V2 V3
     c. V1 NP1 V2 NP2 V3
     d. V1 NP1 NP2 V2 V3
     e. V1 NP1 V2 NP2 V3
     f. V1 V2 NP1 NP2 V3
     g. *. . . NP2 . . . NP1 . . .
     h. *. . . V3 . . . N . . .

In order to treat these cases as CAT, we must have some means of representing verbs that take two arguments. We will adopt the "small clause" analysis.

(74) [[NP < NP] < V]

Given this, we can derive all of the patterns in (73) from the stipulations in (70). In terms of Flip and Reassociate, we can derive all of the patterns in (73) from (73a). For example, we can apply Reassociate to (73a) to derive (75a), and then apply Flip to derive (73f); or we can apply Reassociate to (73a) to derive (75b) and then apply Flip to derive (73d); or we can simply not apply Reassociate, but then apply Flip to derive (73b).

(75) a. → [[[NP1 < [NP2 < V3]] < V2] < V1]
       →Flip [V1 > [V2 > [NP1 < [NP2 < V3]]]]    (73f)
     b. → NP1 < [[NP2 < [V3 < V2]] < V1]
       →Flip [V1 > [NP1 < [NP2 < [V2 > V3]]]]    (73d)
     c. [NP1 < [V1 > [NP2 < [V2 > V3]]]]         (73b)
As in the previous example, Flip and Reassociate play no role in the analysis, which is completely determined by (69). CAT's success in modeling V(P)R is considerable, and the evidence for the LVH is compelling as well. With very simple lexical stipulations about subcategorization of individual lexical items or classes of lexical items—mechanisms that surely no theory could forgo—we have succeeded in modeling V(P)R as described by H&VR, but without movement and without the novel mechanism of dual analysis that they believed necessary to describe the phenomena. If CAT is the appropriate model whenever lexical subcategorizations are played out in syntax, then it should come as no surprise that V(P)R shows CAT-like behavior. Other constructions where CAT should be applicable are noun incorporation, causatives, derivational morphology, and preposition stranding. But not wh movement. CAT is not Categorial Grammar as espoused by (among others) Bach (1976), Moortgat (1988), and Steedman (1996) in that it lacks type-lifting, the feature that makes it possible to embed descriptions of the broadest long-distance dependencies.

8.6 The Hungarian Verbal System
Hungarian has a verbal system very much like that of Germanic. It can be similarly modeled by CAT, but with one striking shortcoming. Traditionally, the positioning of the Hungarian verbal modifier (VM, to be explained below) has been modeled along with the rest of the verbal system. CAT cannot do this. CAT gives a simple and satisfying model of the verbal system minus the VM, capturing many of its very particular (but robust) properties. But when the CAT definitions needed to model the positioning of the VM are added to it, it overgenerates to the point that the model is useless, no longer predicting any of the interesting features. CAT is so restrictive that its failure to model a system is by itself informative, and so no cause for lament. But in this case the message is sharper: it suggests that, despite tradition, the positioning of the VM is independent of the verbal system. In the end I will offer reasons to think this is so.

8.6.1 The Verbal System without VMs
I will quickly sketch the verbal system first without the VM, and then with the VM, noting the main generalizations. These generalizations
represent a hard-won understanding of the system developed over a decade or so by Kenesei (1994), Szabolcsi (1996), Koopman and Szabolcsi (2000), and Brody (1997), among many others. Hungarian has a small series of optional "modal" verbs that occur in a clause in fixed interpretive order, just the sort of system CAT likes.

(76) Nem fogok kezdeni akarni be menni.
     not will.1sg begin.inf want.inf in go.inf
     'I will not begin to want to go in.'
     (Koopman and Szabolcsi 2000, 16)

Ignoring the VM (be), each element in (76) has scope over all elements to its right. Furthermore, any reordering of adjacent elements results in ungrammaticality. From this, we can conclude that the following order holds:

(77) nem > fogok > kezdeni > akarni > main-verb

In its rigidity, and its rightward orientation, this system resembles for example the English auxiliary system, and in fact, Koopman and Szabolcsi (2000) refer to the order in (76) as the English order. I will adopt this term from them and use it to refer to the "head-first" order. It is of course the RLS. In addition to the order displayed in (76), Hungarian has a different—in fact, opposite—way to deploy the series in (77).

(78) a. Nem [fogok > kezdeni > [[be < menni] < akarni]].
     b. Nem [fogok > [[[be < menni] < akarni] < kezdeni]].
     (Koopman and Szabolcsi 2000, 210)

Importantly, the interpretive order of the elements in (78) is the same as in (76); that is, akarni always has scope over menni, for example, despite their being in opposite orders in (76) and (78). In other words, (78) represents different ways to realize the abstract structure in (77). The carets in (78) indicate the understood orders. The order of elements in (78b) I will call the compound order, as the head-complement order is that found in compound terms. Brody calls it the roll-up order, for good reason, as we shall see. The tensed verb and its complement are always in the English order.
As the forms in (78) show, any given sentence with multiple auxiliaries will show a mixture of the English and compound orders. But there are strong constraints on the mixture.
1. The tensed verb cannot occur in a compound order.

(79) a. fogok > be < menni < akarni < kezdeni
     b. *be < menni < akarni < kezdeni < fogok

2. Any compound structure must be at the bottom of the string of auxiliaries.

(80) a. nem > fogok > kezdeni > akarni > be < menni
     b. nem > fogok > akarni > be < menni < kezdeni
     c. *nem > fogok > [akarni < kezdeni] > be < menni

3. The English order cannot occur inside a compound order.

(81) a. fogok > be < menni < akarni < kezdeni
     b. *fogok > be < [akarni > menni] < kezdeni

These three findings can be summed up in the following recipe for creating alternative orders for a given string of auxiliary verbs completely in the English order: beginning at the bottom, the bottom two terms can be compounded, or "rolled up"; and this rule can be applied repeatedly, but not at the very top, where the tensed verb must be in the English order. This system is easily modeled in CAT. Since each auxiliary, apart from the tensed auxiliary, can appear on either side of its complement, each is ambiguous with respect to order; that is, each has both of the following subcategorizations:

(82) F, F

This by itself is not enough, because, with the RC, it will generate all of the ungrammatical orders in (79)–(81). (80c), for example, would count as grammatical, with exactly the parse indicated. To prevent this, we must also impose level constraints. There is some question what the relevant levels are; I will assume they are word and phrase (as the term compound in compound order suggests). Assuming further that the compound order is essentially lexical, and the English order is essentially phrasal, we have the following subcategorization:

(83) Aux: Fn, F0

That is, each auxiliary takes a phrase of type F to the right, or a word of type F to the left. Furthermore, because the tensed auxiliary does not participate in the compound structures, it has strictly the left of the two subcategorizations in (83).
(84) AuxT: Fn

I assume this is a further stipulation, as there is in general no ban on tensed verbs entering compound structures (e.g., English baby-sat). Then, given the RC, along with the assumption that words can head words, and words can head phrases, but phrases cannot occur in words, we predict some of the contours of the Hungarian system. The fact that the English order cannot occur in the middle of a compound follows from the fact that a phrase (the bracketed FP in (85)) cannot occur in a compound (marked here with { }).

(85) *fogok > {[akarni > [be < menni]]FP < kezdeni}

The fact that a compound cannot occur in the middle of a sequence of auxiliaries does not follow from the specifications in (83). (86) is a parse of such a case consistent with (83).

(86) *nem > fogok > [akarni < kezdeni]Aux, VP > [be < menni]VP

In (86) akarni and kezdeni form a compound verb, where akarni has its VP-taking, rather than V-taking, subcategorization; that subcategorization is inherited by the compound, according to the RC. Although some speakers accept forms very much like this, I will assume that they are ungrammatical, and I will introduce the further specifications necessary to rule them out. The problem would be solved if akarni were prevented from using its VP-taking subcategorization when it was in a compound. This can be achieved by reconstruing the ambiguity of the auxiliary verbs in a slightly different way. Specifically, the principal ambiguity will be between root- and word-level forms for each of the auxiliaries, as in (87).

(87) akarni: root, Froot
             word, Fn

That is, akarni is still ambiguous, but between the two levels root and word; roots enter into the compounding system, and words into phrasal syntax. Now (86) cannot be produced; only the root akarni can appear on the left of a compound, and only a further root subcategorization can be inherited by the compound. To allow compound structures to appear in syntax, we must allow roots to be reconstrued as words; once this is done, they can be used in syntax, but they cannot enter the compounding system again. But this is the classical relation between words and phrases.
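The recipe summed up above (roll up repeatedly from the bottom, never at the very top) amounts to saying that a grammatical linearization is an English-order prefix, headed by the tensed verb, followed by one reversed bottom block. The following one-line generator is my own illustrative sketch, not the text's formalism; Python is used for concreteness only, and the VM be is set aside, as section 8.6.2 argues it is positioned by a different mechanism.

```python
def rollup_orders(chain):
    """chain: the auxiliaries and main verb in English (scope) order, tensed
    verb first. Each grammatical order keeps a prefix of at least one element
    (the tensed verb) in English order and reverses the remaining bottom
    block; j marks how much of the top stays in the English order."""
    n = len(chain)
    return {' '.join(chain[:j] + chain[j:][::-1]) for j in range(1, n)}

orders = rollup_orders(['fogok', 'kezdeni', 'akarni', 'menni'])
```

For a four-member chain this yields three orders: the pure English order and the two roll-ups of (78), while excluding the tensed verb from any compound, as in (79b), and excluding any non-bottom compound, as in (80c).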
While the "coding" in (87) may appear suspicious, it is really harmless, when one considers that if CAT is the model, the only way languages can differ is with respect to level, order, and type restrictions, and these restrictions are enforced in a rigid local fashion by X-bar inheritance and the RC. I suspect that the ambiguity in (87) occurs in English as well, with particle-verb constructions; that is, the relation between (88a) and (88b) is really a level-order ambiguity between root- and word-level identification of the particle itself.

(88) a. John [looked up]V the answer.
     b. John looked the answer up.
     c. *John looked right up the answer.
     d. John looked the answer right up.
     e. the looking up of the answer
     f. *the looking of the answer up

The lexical version of the particle excludes modification (88c), whereas the syntactic version allows it (88d). The lexical version nominalizes (88e); the lexical particle is "inside" the nominalization and therefore immune to the laws governing the form of NPs. The syntactic version does not nominalize (88f); the syntactic particle is "outside" the nominalization, where it is excluded from NP on general grounds. I imagine this line of analysis could be applied to German separable prefixes as well. Finally, to account for the absence of tensed verbs inside compound structures, we require that T be represented only by a word-level element. In the reformulation this remains a separate stipulation. These stipulations exactly account for the Hungarian compounding paradigm, if the VM is excluded. Koopman and Szabolcsi (2000) seek a theory of clusters that involves only phrasal syntax and XP movement. They thus seek to avoid any reference to the lexical/phrasal distinction on which the analysis just given rests. Their theory thereby also distinguishes itself from any of the theories in which the roll-up structure results from X0 movement, and VM fronting from XP movement.
But on close inspection the relevant distinction can be found in Koopman and Szabolcsi's account, just relabeled as "smallness" instead of "lexicality." Smallness, never defined, has less intuitive content than lexicality, though it would seem to be extensionally equivalent to it, judging from the examples that Koopman and Szabolcsi give. But "smallness" leads to grave problems that "lexicality" does not have.
What allows Koopman and Szabolcsi to contemplate the elimination of X0 movement is that massive remnant movement makes it possible to simulate lexical movement by phrasal movement, as in the following derivations:

(89) a. [XP YP ZP H]HP →
        [XP [YP [ZP [tXP tYP tZP H]HP]]] →
        [tXP tYP tZP H]HP [XP [YP [ZP tHP]]]
     b. [XP YP ZP H]HP →
        [H [XP YP ZP tH]HP]

The pair of movements in (89a) result in the same surface configuration as the movement in (89b). The movements in (89a) are first evacuation of everything in HP except its head, followed by movement of the remnant HP. The movement in (89b) is head movement. Koopman and Szabolcsi simulate the head clustering for verbs in the compound structure with the following condition:

(90) When the specifier of VP+ is a small VM or an inverted sequence, VP+ optionally extracts from CP. Otherwise, VP+ cannot extract from CP.

For reasons of space, I will not explain here how this principle interacts with the theoretical environment that Koopman and Szabolcsi provide to yield the constructions I have identified as lexical, or at least as involving nonphrasal heads; but see Williams, in preparation, for a full discussion. It is enough to see that lexicality is entering the system under the guise of smallness. I think this is a step backward from the general understanding of these constructions, in that it replaces a word with a relatively concrete meaning (lexical) with one distinctly less concrete (small).
The VM occurs either before or after the tensed verb, depending on features of the sentence in which it occurs. If there is a preverbal neg-
Inflectional Morphology
235
ative or Focus phrase, the VM occurs after the verb; if not, and if some other conditions are met, it occurs before the verb. (91) a.
Nem fogok be menni. not will.1sg in go.inf ‘I will not go in.’ b. Be fogok menni. c. *Nem be fogok menni. d. *Be nem fogok menni.
Be is a complement of menni; but in (91b) it occurs to the left of the tensed auxiliary verb. And in fact, an unbounded number of auxiliary verbs can appear between the particle to the left of the tensed verb and the verb of which it is a complement. (92) Be fogok kezdeni akarni menni. in will.1sg begin.inf want.inf go.inf The question is, what regulates the relation between these two positions? The ‘‘trigger’’ for the appearance of be in initial position has been argued to be phonological (e.g., Szendro˝ i 2001): the auxiliary verb needs ‘‘support,’’ if not from a negative or a Focus, then from a particle. I will assume that the trigger is an extrinsic constraint that CAT is not obliged to model. Even so, CAT fails. So far I have posited leftward root subcategorization for the compound order and rightward phrasal subcategorization for the English order. To generate (92), the CAT specifications must admit a third possibility— namely, that a sequence of words can realize the English order, as only words can transmit, via the RC, the lower verb’s need for the particle to the top of the verb chain. (93) a. Aux: F word b. menni: be c. be < [fogok > kezdeni > akarni > menni] If each auxiliary has a specification like the one in (93a), and the verbs taking VMs have specifications like the one for menni in (93b), then (92) will have a parse like (93c). There is in fact some circumstantial evidence in favor of treating VMs in this way. The verbs that enter into compounding relations with one another are approximately the same verbs that permit VM raising: uta´lni ‘hate’, for example, does neither. But the lists are not identical (K. Szendro˝ i, personal communication), so this consideration is hard to evaluate.
236
Chapter 8
But there are two problems with analyzing VMs in this way. First, (93) predicts that particle movement should be compatible with compounding, but it is not. (94) *Be < [fogok > kezdeni > [menni < akarni]]. Particle raising is compatible only with the pure English order, so any compounding interferes. From the point of view of CAT this is very odd, as other phrasal complements are compatible with compounding, which shows that compounding is transparent to a main verb’s subcategorization. For example: (95) Nem > fogom > akarni > [sze´t szedni < kezdeni] a ra´dio´t. not will.1sg want.inf apart take.inf begin.inf the radio This example shows that compounding of the main verb (represented by the bracketed sequence) does not prevent the main verb’s direct object subcategorization (sze´tszedni: NP) from becoming the subcategorization of higher constituents. If for direct objects, then why not for particles? Second, particles seem to be able to raise out of embedded CP complements under certain circumstances. For example: (96) Sze´t kell, hogy szedjem a ra´dio´t. apart must that take.subjunctive.1sg the radio ‘I must take apart the radio.’ (Koopman and Szabolcsi 2000, 211) Although such cases are quite restricted, the fact that they exist at all suggests that CAT is not the right mechanism to account for them. These two properties of VM positioning—opacity of the compound structures and nonlocality—both point to movement in the classical sense, rather than CAT inheritance. Compounds are always opaque to syntactic movement, but CPs are not. If indeed the VM is positioned by movement and not by the same sort of system that creates the verbal clusters, a sharp theory is needed to explain how a child would not be led astray by all the evidence that has misled linguists into analyzing the two phenomena as one system. CAT is just such a theory, because simple considerations unequivocably rule it out as a model of the VM, even though it is an obvious model of the verbal clusters. 
Another reason to implicate movement in the positioning of the VM is noted repeatedly by Koopman and Szabolcsi (2000): the VM can often be
a full phrase. This again is characteristic of movement, especially movement that bridges CPs.

(97) [a szobá-ba]PP menni
     the room-into go.inf
     'go into the room'

And, importantly, the VM cannot be phrasal when incorporated into a compound.

(98) *[[a szobá-ban]PP maradni] akarni
     the room-in stay.inf want.inf
     'want to stay in the room'

This example falls within the scope of the theory outlined in section 8.6.1: compounding involves X0s exclusively. (96) and (97) fall outside that theory. I think that CAT's initial difficulty in modeling the Hungarian verbal complex turns out to be its virtue: CAT has the grace to fail obviously and thereby to show where nature is jointed. Perhaps, as the last few points independently suggest, the Hungarian VM does not compose a homogeneous class of elements with the verbal particles after all.
In light of our conclusions about Hungarian, we can return to the problem raised in chapter 7 about verb clusters in Czech and related languages; (99) repeats the facts from that discussion.

(99) a. Dal jsem mu peníze.
        give.prt aux.1sg him.dat money.acc
        'I gave him money.'
     b. Tehdy bych byl koupil knihy.
        then aux.1sg was.prt bought.prt books.acc
        'Then I would have bought books.'
     c. Byl bych t_byl koupil knihy.
     d. *Koupil bych byl t_koupil knihy.
     (Konapasky 2002, 246)

When there is a single participle, it can move to the left of the auxiliary. When there are two participles, the first can move to the left of the auxiliary, but the second cannot. With the Hungarian system as a model, we formulate the following restrictions:

(100) a. aux: PartP
      b.      Part0
      c. part: XP
That is, auxiliary verbs can take a following participial phrase or a preceding participial stem; participles, on the other hand, always take an XP complement. When both an auxiliary and a participle are present, two structures are possible.

(101) a. aux > [part > X]PartP
      b. [[part < aux] > X]

(101b) corresponds to the possibility of (99a). When there are two participles, the following structures are possible:

(102) a. aux > [part1 > [part2 . . . ]Part2P]Part1P
      b. [part1 < aux] > [part2 . . . ]Part2P
      c. *[part2 < [aux > part1]PartP]
      d. *[part2 < [part1 < aux]PartP]

The first participle can form a complex word with the auxiliary, and the result will have the subcategorization of the nonhead part1 and so takes Part2P on the right. But there is no way for the second participle to appear on the left as in (102c), because the unit [aux > part1] will itself be phrasal and therefore cannot take a stem complement to the left. Similarly, (102d) cannot be formed because [part1 < aux], while a stem-level object, inherits its subcategorization from its nonhead (part1) and so can only take an XP to the right, not a participial stem to the left. In Czech, then, auxiliary verbs are just like the Hungarian cluster-forming auxiliary verbs, and participles are like Hungarian nonauxiliary verbs. These languages even have an analogue of Hungarian VM positioning. There is a general rule of XP topicalization (Rivero 1991, Konapasky 2002) that can fill the initial position, illustrated here in Serbo-Croatian.

(103) [Čitao knjigu]VP je Ivan t_VP.
      read.prt book aux Ivan
      'Ivan had read the book.'
      (Konapasky 2002, 244)

If such an example were taken to show that aux had, in addition to (100a) and (100b), a subcategorization like the following, then, as in Hungarian, all sorts of unrealized possibilities would arise:

(104) aux: XP

Rather, as Konapasky (2002) shows, such phrases occupy the initial position by virtue of an entirely different process of XP topicalization.
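The predictions in (101) and (102) can be verified with a toy parser. The sketch below is mine, not the text's formalism: the category and level labels are simplified, the lowest participle's complement is collapsed to a single NP token, and only the two-participle paradigm is modeled. Each lexical entry carries at most one pending subcategorization (direction, category, level), per (100), and compounding passes on the nonhead's subcategorization, per the RC.

```python
# Lexicon per (100): aux takes a following participial phrase or a preceding
# participial stem; participles take a following phrasal complement.
# An analysis is a triple (category, level, pending subcategorization or None).
LEX = {
    'aux':   [('Aux', 'stem', ('right', 'Part1', 'phrase')),
              ('Aux', 'stem', ('left',  'Part1', 'stem'))],
    'part1': [('Part1', 'stem', ('right', 'Part2', 'phrase'))],
    'part2': [('Part2', 'stem', ('right', 'NP', 'phrase'))],
    'obj':   [('NP', 'phrase', None)],
}

def analyses(words):
    """All analyses derivable from a contiguous word sequence."""
    if len(words) == 1:
        return LEX[words[0]]
    found = []
    for i in range(1, len(words)):
        for lc, ll, ls in analyses(words[:i]):
            for rc, rl, rs in analyses(words[i:]):
                # syntactic complementation: head > XP; the result is a phrase
                if ls and ls[0] == 'right' and (rc, rl, rs) == (ls[1], 'phrase', None):
                    found.append((lc, 'phrase', None))
                # compounding: stem < head; the result is a stem that
                # inherits the nonhead's subcategorization (the RC)
                if rs and rs[0] == 'left' and (lc, ll) == (rs[1], 'stem'):
                    found.append((rc, 'stem', ls))
    return found

def grammatical(words):
    """True if the string parses as a saturated phrase."""
    return any(lvl == 'phrase' and pend is None
               for _, lvl, pend in analyses(words))
```

The parser finds derivations for the (102a,b) orders, cf. (99b,c), and none for the (102c,d) orders, cf. (99d): once [part1 < aux] is built, its only inherited subcategorization points rightward, so nothing can put part2 on its left.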
Chapter 9 Semantics in Representation Theory
Two features of RT lead to revisions in the standard assumptions about how semantics is determined by syntactic form. One stems from the notion of derivation in RT. In the syntactic analysis of a sentence, there is no single structure that represents all of the information relevant to semantics; semantics then must be done over the whole set of forms that constitute the derivation and the matching relations that hold among them. The other stems from the fact that the shape-conserving matching that holds between levels does not always correspond to isomorphism, as we have seen in several cases, beginning with the bracketing paradoxes of chapter 1. To the extent that one end of such matches is semantic (or, more semantic than the other), they give rise to instances in which the system deviates from a strictly compositional system, the sort of system that is standardly assumed. In sections 9.1 and 9.2 I will briefly outline the issues involved in these two deviations, but without arriving at any firm conclusions, apart from what I have just mentioned; the discussion is provisional and speculative throughout, even by the standards of the previous chapters. The role of blocking in determining meaning will receive special attention, since, as pointed out frequently in this book, blocking is part and parcel of Shape Conservation: the most similar blocks all the less similar, all else being equal. In sections 9.2–9.5 I will explore, in the most preliminary possible way, how RT fares in analyzing certain problems connected with the form-meaning relation. In some cases I think an obvious advantage can be demonstrated; in other cases I can show no more than that a coherent account is possible. In section 9.2 I will illustrate the role of the blocking aspect of Shape Conservation in understanding the contribution of Case marking and the like. In section 9.3 I use the RT levels to index different sorts of focus. In section 9.4 I address some problems in ellipsis, and in
section 9.5 I sketch how RT levels can be understood to index different kinds of NP interpretations.

9.1 Compositionality
9.1.1 Matching and Compositionality

I take the interesting hypothesis about compositionality to be that it is strict—every phrase’s meaning is some strict function of the meaning of its parts; otherwise, the hypothesis does not say much. In what follows I will be talking about representations of meaning that have structure: representations that indicate the scopes of quantifiers, or that identify the thematic roles of NPs, or whatever else there is—some of the levels of RT. So I will be discussing translation of syntactic structures into some other kind of language, not real semantics, which relates sentences to the world. The question about compositionality then is one of compositional translation: is the translation of every phrase X strictly a function of the translation of its parts? In the compositional scheme we start with a syntactic tree in language A, and step by step, from the bottom up, we build some translation of that tree. We do this for every sentence in language A, thus deriving a second language B, consisting of all those translations. So B is whatever A translates to. But there is another way to think of translation. We can think of the languages A and B as both antecedently defined, and of the translation as a ‘‘matching’’ relation between them, one that matches to every sentence in the first language a corresponding item in the second language. Of course, compositional translation can be viewed as one particular kind of matching relation. In fact, if we require the matching relation to be absolutely ‘‘shape conserving’’—that is, if it matches up structures in language A with structures in language B, observing conditions on the identification of terminal elements across the two languages and respecting isomorphism of structure—then the matching kind of translation might be indistinguishable from compositional translation.
In compositional translation, the bottom-up piece-by-piece building of the second tree based on what is found in the first tree will result in a tree that is isomorphic to the first tree, and so matching translation and compositional translation will always come out the same. But there is a circumstance in which these two notions of translation could diverge. For matching translation, we can think of the two languages that are being matched up as definable independent of one another, according to laws of form that might differ. In that case there might not be an isomorphic match in the second tree for every phrase in the first. This need not necessarily prevent the matching translation from being a complete translation. For example, if there is no isomorphic structure, the matching relation might pick the ‘‘nearest’’ structure as the translation—still shape conserving, but not strict. The translation will be fully defined, but it will diverge from a compositional translation for such cases. In the course of exposing RT in this book I have already presented cases like this, which support the idea that the translation is matching, not compositional. The cases of mismatch discussed in chapters 1 and 2 all have this character: in-situ quantifiers get wide scope by ‘‘mismatching’’ the structures at a later level, for example. But if this is possible in general, then the question becomes, what makes the translation look largely compositional? The answer has to be a combination of the fact that the matching relation that happens to be in use in language is shape conserving in the sense just mentioned, and the fact that the structures defined in the two sublanguages are largely similar. It is straightforward that if the two languages are fully isomorphic, then the result is indistinguishable from compositional translation. But if the structures defined in the two languages are only slightly divergent, then the discrepancies between the two results might be infrequent and localized. However, there is an interesting difference between compositional translation and matching translation that goes beyond these discrepancies. In compositional translation, the translation of any given sentence proceeds on the basis of that sentence by itself.
But in the matching theory, at least for the discrepant cases, the conclusion that b in language B is the ‘‘best match’’ for a in language A cannot be determined by looking just at a and b; instead, it must involve seeing what other structures are defined in languages A and B, insofar as there cannot be anything that is a better match to a than b is. In this sense the matching translation is ‘‘holistic’’: it matches the whole of language A to the whole of language B in a way that cannot be broken down into the matching of individual elements in A to individual elements in B. In linguistics, syntactic transformation has traditionally been the means of accounting for divergences of this kind, preserving compositionality. For example: how can we compositionally determine the thematic structure of the verb see when its direct object is moved many clauses away?
Undo the transformation first; the transformation is responsible for the distorted picture of thematic structure in surface structure. But I have argued in specific cases (quantifier scope, heavy NP shift, scrambling, etc.) that movement is not the correct account; rather, it is interlevel holistic matching of structures.

9.1.2 Compositionality and the Place of Semantics in RT

We can think of RT as involving two different representational situations. In one, a structure represents another piece of syntax, and in the other, it represents a piece of semantics. To take one example, SS representing CS is a case of syntax representing syntax, and SS representing QS (= TopS) is a case of syntax representing a semantic structure. To take another example, in chapter 2 I analyzed a particular kind of linguistic variation as arising from the way different languages resolve a conflict between a case of ‘‘structural’’ representation and a case of ‘‘semantic’’ representation. The formulas for English and German scrambling are these:

(1) English favors SS ← CS over SS ← QS (= TopS).
    German favors SS ← QS (= TopS) over SS ← CS.

And in both languages the possibility of SS ← FS can neutralize the difference. From this mechanism I derived the fact that English requires elements following the verb to maintain a strict order that only focusing effects can disrupt, whereas in German scrambling is obligatory, except in the face of some focusing effects. The model implicit in the above discussion is not the linear representation model, but a model in which there are three levels that SS must represent, namely, CS, QS (= TopS), and FS.

(2) CS → SS ← FS
         ↑
         QS (= TopS)

In such a model we can talk about the competing representational requirements that these three peripheral structures place on SS. In what follows I want to bring the model back in line with the linear representation model, yet allow for the representational competition that the results in chapter 2 depend on.
But at the same time I want to model certain other phenomena involving focus, which will make the model in (2) unworkable. Furthermore, I want to develop a sense of the gross architecture of the entire model, instead of simply adding a new level represented by SS every time a new descriptive problem presents itself.
The project begins with the previously mentioned, possibly indefensible, categorization of the levels into ‘‘semantic’’ and ‘‘syntactic.’’ In some linguists’ view, all representations (i.e., ‘‘structures’’) are syntactic. But some are intuitively more semantic than others: the representation that unambiguously displays the scope of quantifiers is more semantic than the representation that displays structural Case relations. But there is another way to describe the difference between two kinds of representations: one kind lies directly on the path to spell-out, and the other kind does not. So, in the model that was the basis for the early part of the book, CS and SS were indubitably on the way to spell-out, and QS certainly was not, given the existence of in-situ ambiguous quantifiers in the output of English pronounced sentences. In this light, FS itself is a fudge. FS consists of at least the two different elements, ‘‘display of primary sentential accent’’ and ‘‘display of most salient new information.’’ These two different notions are clearly related—but how? We can begin to form a new model by identifying certain representations as ‘‘semantic’’: TS, QS, FS. These will not be on the main line to spell-out. The other representations will be: CS, PS, SS, and AS (Accent Structure, which displays the accent structure of the utterance). The main line from CS to AS will be a linear series of representation relations, as follows:

(3) CS → PS → SS → AS

We must also add the interpretive representations. Clearly, different syntactic levels are relevant for different aspects of interpretation; for example, AS is relevant for focus, but CS may not be. A simple scheme would be to associate each of the interpretive levels with one of the syntactic levels.

(4) TS    ?S    QS (= TopS)    FS
    ↓     ↓     ↓              ↓
    CS →  PS →  SS       →     AS
In general, representational conflicts at a given level will arise between the interpretive level and the structural representational demands on that level. Whether there are further conflicts will be taken up in the next section. This model permits the chapter 2 analysis of English and German, though now the analysis is cast in slightly different terms. English favors CS → SS over QS (= TopS) → SS, and German favors the reverse.
Moreover, the effects of focus can be factored in by taking AS ← FS fidelity into account, in that it can tip the balance back to parity in the otherwise lopsided representational conflict. (Review chapter 2 for the empirical basis of the English/German difference, and see section 9.3 for further analysis of the AS, SS, FS system.) The model in (4) suggests that in general, since each syntactic level represents both another syntactic level and an interpretive level, representational conflicts will arise between these two. Whether there are further sources of conflicts will be taken up in the next section. Although it is compatible with the findings of this book, and in fact makes them natural, the model in (4) raises questions about the linguistic representation of meaning. Each interpretive level is separate from the others, and there is no connection, no representation relation, between them. Each of them is an aspect of LF, in the usual sense, exactly in the sense that in RT each of the syntactic levels is an aspect of the syntax of a clause. But how are these different aspects related to one another? The theta structure of a clause will display the theta relations of the verb in relation to the verb, and the quantification structure will display the quantificational structure, but what is the relation between the two? One wants to know which argument of the verb is quantified in which way. To take a concrete example, consider a focused definite NP agent of a verb. Its agentivity is represented in TS, its definiteness in CS, and its focused property in FS, but how are all these facts related to each other? The obvious answer is representation. Although I have spoken of representation as relating whole structures to whole structures, in doing so it relates parts of structures to parts of structures. For example, an internal argument of V in TS will be mapped to a Case-marked accusative in CS, and so forth, all the way to a focused constituent in AS/FS.
We can thus speak of an NP that is Case-marked, theta-marked, and focused only by taking into account all of these levels and how they are related to one another by shape-conserving representation. We can even define a relation between the theta role an object receives and its scope, even though these will not be in any direct chain of representation, because there will be an induced representation relation that holds between them, by virtue of the representations that the model does express directly.
(5) TS ⟶ ?S ⟶ QS (= TopS) ⟶ FS
    ↓      ↓       ↓            ↓
    CS  →  PS  →   SS    →      AS
The representations symbolized by the long arrows are induced by the representations induced by the short arrows, in the fashion described before. Some NP in AS represents a focused NP in FS, and that NP represents an NP in SS, which represents . . . some NP in TS, and so there is an indirect relation between FS and TS, and also between particular NPs in FS and NPs (or whatever arguments are) in TS. For example, consider the following TS/QS pair, with the obvious head-to-head matches:

(6) TS: [boy]agent [V [girl]patient]
    QS: [some boy]QP [V [every girl]QP]

The natural isomorphism will match the agent in TS to the preverbal QP in QS; this will result in the further match between boy and boy. Some will not be matched, as it makes its ‘‘first’’ appearance in SS and QS. Boy occurs in both TS and QS; in TS it is agent, and in QS it is (head of) a quantified NP. The full interpretation of [some boy] in QS and later levels will be some function of the interpretation of [boy] as agent of V. And so on. If the matching between levels were always isomorphic, then the induced isomorphism could be established directly, abridging the representation circuit. But owing to the existence of misrepresentation, the induced representation must make essential use of the chain of representation relations to establish the relation between TS and QS. But nothing is changed when mismatching occurs. Recall that English favors SS as a representation of CS over QS, and so surface structures with two quantification structures are ambiguous, in one instance mismapping the two Case structures by crossing.

(7)
Here, as before, representation provides the relation between the quantified NPs and their images in TS. This aspect of semantics in RT is nothing other than the ‘‘higher equals later’’ correspondent of compositionality in standard Checking Theory practice. That is, representation replaces domination for functional
embedding. For example, in RT an accusative represents a patient; in standard practice it functionally dominates it.

9.2 Blocking in Semantics
The blocking principle in general prevents multiple representation; that is, the following situation is not allowed:

(8) *x → ys1
     x → ys2

(8) corresponds to the notion that ‘‘nature hates a synonymy’’—there cannot be a difference in form without some difference in meaning. If x is a concept and ys1 and ys2 are words, then (8) is the notion of synonymy that holds in the lexicon, and especially in inflectional morphology, where it is understood that variant forms like sneaked/snuck cannot coexist in the same grammar. The blocking principle, thus construed, has been shown to be an operative constraint on language acquisition (Pinker 1984); it is what drives out *goed. It can also, and perhaps thereby, be construed as a constraint on the form of a grammar. As ordinarily understood, the blocking principle does more than forbid synonymy; it says which of two forms is chosen to represent the given meaning—namely, the one more specifically tailored for that meaning. For example, were is the general form of the past tense of be, and was is the form specific to the 1st singular past; although both was and were are compatible with ‘‘1st singular past,’’ blocking dictates that only was can express that notion, being most specifically fitted to it. Although I do not think that the blocking principle is well understood, I nevertheless regard the principle of Shape Conservation to be a case of blocking in the sense just described. If ys1 and ys2 are both candidates to represent x in (8), and if ys1 is more congruent to x than ys2 is, then ys1 is ‘‘more specific to’’ x than ys2 is, and must be chosen to resolve the synonymy. In the simplest case (‘‘all else being equal’’) that should settle the matter. But in fact, since different representational levels are connected to different aspects of meaning, it is inevitable that blocking will not give a determinate answer to the question of which of two forms is to be used to represent, for example, a given theta structure.
For this reason the role of the blocking principle in the present context is not straightforward. Such a principle is clearly required, but it is not clear what phenomena fall under it. For example, we have analyzed
HNPS as a case of ‘‘misrepresentation’’ between CS and SS, which exists alongside the ‘‘true’’ representation; so, restricting ourselves to CS and SS, (8) seems to be instantiated. But, as we saw in the discussion of HNPS, this ‘‘misrepresentation’’ is accompanied by differences in interpretation at FS. So the blocking principle is upheld in the end, but in a wider context, one that includes FS. Constructions that look synonymous (the shifted and unshifted variants of HNPS cases) turn out to have different meanings at FS. But this raises the question, what differences can count as differences that license a representational synonymy? For, in the case of HNPS, the TS → CS representation does display representational synonymy; it is only in a later representation that the focus-related difference in meaning arises. So it is natural to ask, is there any limit on the ‘‘delay’’ that can occur between a representational synonymy and the difference in meaning that rescues it? To put the question in concrete terms: Scrambling interacts with definiteness in German, and other semantic classifications, in ways analyzed in chapter 2. There, the Synonymy Principle was seen to be satisfied in a direct way, in that the structure of the example discussed always looked like (9).

(9)

Suppose surface structure1 and surface structure2 are the scrambled and unscrambled representations of one and the same Case structure, as the diagram illustrates. We know that in German this situation is correlated with differences in scope/topicalization aspects of interpretation represented in QS. So the CS → SS representation involves ‘‘synonymy,’’ but the surface structures do not, as each surface structure in SS receives a different interpretation in QS.
Now consider a different kind of case, one that in fact appears to model known phenomena. Suppose that two different Case-marking systems could represent one and the same theta structure, but with the same or a related difference in meaning as in the case of German scrambling; in other words, scope, or specificity, or something else, turns on the difference. In such a case the sign of the difference in meaning would be ‘‘remote’’ from the representation of the difference in meaning itself: a Case distinction would control a difference in meaning two representations away, so to speak, as shown in diagram (10).

(10)
The licensing of the TS → CS synonymy is ‘‘delayed’’ until QS. On methodological grounds I suppose we should begin by disallowing such cases, in that we would then have a much tighter idea about the scope of the Synonymy Principle. With delayed licensing, we are saying that any difference in meaning can license any difference in form. Without delayed licensing, we can more narrowly specify how differences in form and differences in meaning are related to one another: only differences in form and differences in meaning that are in the same region of the model can interact in this way. The actual predictions would of course depend on the details of the model, but to take an extreme case, differences in Case marking (at the early end of the model) could not correspond to differences in information structure (at the late end). To illustrate with a concrete case, consider Swahili object agreement (OM indicates the object agreement affix).

(11) a. N-a-m-penda Juma.
        I-tns-om-like Juma
        ‘I like Juma.’
     b. *Napenda Juma.
     c. N-a-ki-soma kitabu.
        I-tns-om-read book
        ‘I read the book.’
     d. Nasoma kitabu.
        ‘I read a book.’
When the object is animate, object agreement is obligatory; but when the object is inanimate, it occurs only with definites. If object agreement is at the same level as Case assignment, then the pattern in (11) shows that Swahili agreement for indefinites has TS → CS synonymy, not resolved until QS, if QS is where definites and indefinites are sorted out. This conclusion that delayed synonymy resolution is possible can be averted by structuring the model differently. For example, suppose that QS represents the scope of quantifiers, as before, but that the definite/indefinite distinction is established earlier—say, in CS. Then of course the TS → CS synonymy is resolved on the spot, and the more narrow conception of how blocking enforces itself is possible. I do not find myself in any position to resolve the question of delayed licensing of synonymy. It is a question that does not translate easily into standard minimalist practice with Checking Theory, and so it deserves further study as a means of empirically distinguishing these two styles of modeling how semantics is determined by syntactic form.

9.3 Kinds of Focus
9.3.1 IFocus and LFocus

The RT model just outlined provides an index to another set of related entities, the different kinds of focus. Several kinds of focus, or focusing effects, have been cited in the literature: normal focus, contrastive focus, and the focusing that occurs in special constructions like pseudocleft, cleft, scrambling, and HNPS. I think this variety can be understood in terms of mismappings between levels. If we look at the right-hand side of the model as it now stands, we see several opportunities for mismatch.

(12) (QS →) SS → AS (← FS)
The mismatch between SS and QS (= TopS) has already been discussed in chapter 2, and nothing said here will change the conclusions drawn there. I will try to show that the way SS, AS, and FS relate to one another
can account for the variety of focusing effects and can allow them, despite their different properties, to be seen as part of a systematic whole. I will begin by drawing attention to an only partly appreciated dimension on which types of focus can be differentiated. The discussion that follows depends on sorting them out clearly. One kind of focus generates a propositional presupposition—that is, a presupposition that some proposition is true. This sort of focus is found in the cleft construction, for example.

(13) It was John who Bill saw.

(13) presupposes that Bill saw someone. I will call this kind of Focus a Logical Focus (LFocus). I include in this type the answers to questions. When the answer to a question is a whole sentence, the ‘‘real’’ answer must be the focus of the sentence.

(14) A: What did you buy in New York?
     B: I bought a RECORD in New York.
     B′: *I bought a record in New YORK.

The question-answer focus is often cited as the core case of normal focus. The other kind of focus is tied directly to the placement of main sentence accent, but it does not involve anything propositional. For example:

(15) John wants a red hat and a BLUE hat.

The ‘‘presupposition’’ generated by focusing on BLUE is just the word hat, and nothing bigger than that. One could try to extract a propositional presupposition from (15) (e.g., John wants an X-colored hat), but that is an artifact of the particular example and is not possible in general.

(16) John compared the red hat to the BLUE hat.

There is no proposition out of which BLUE has been abstracted in (16). Rather, BLUE is what I called a disanaphor in Williams 1997, and hat is its paired anaphor; the requirement is that the disanaphor be different from whatever stands in the same relation (‘‘R→’’ in (17)) to the antecedent of the anaphor that the disanaphor bears to the anaphor.

(17)
X  ≠  disanaphor
R→         R→
antecedent of anaphor  =  anaphor
This is the Disanaphora Principle proposed in Williams 1997, where it is shown that the relation between hat and hat in (16) obeys general principles of anaphora. The accent pattern, and the accompanying anaphoric commitments, are essentially obligatory.

(18) *John compared the red hat to the blue HAT.

(The fact that (18) is not absolutely ungrammatical is a point to which I will return.) The most convincing examples showing that accent-induced Focus/Presupposition structure has nothing to do with propositional presupposition come from how telephone numbers are pronounced when they include repeated digits (M. Liberman, personal communication).

(19) a. 258-3648
     b. *258-3648
     c. *258-3656
     d. 258-3656

Here again the pattern is obligatory, so long as the speaker groups the digits in the usual way (3-2-2). Again, no propositional presupposition is raised. The anaphora involved here takes ‘‘same digit’’ as the identity condition in the domain in which that anaphora operates. As this kind of focus pertains to what has been called the information structure of a sentence, I will call it Information Focus (IFocus). I will associate IFocus and LFocus with different levels in RT. As LFocus for the pseudocleft construction involves wh movement, it cannot occur any earlier than SS, and I will assume that it is defined in SS (or the closely related QS). As IFocus involves the phonological accent pattern, it is plausibly associated with AS, which itself determines FS (Information Structure (IS)), resulting in the following diagram (the same as (12)):

(20)
(QS →) SS → AS (← FS)
       LFocus  IFocus
We now have two notions of focus, so it is important to know how they are related to each other. The answer is representation. That is, in the normal situation, IFocus represents LFocus. Notice that the representation is not direct, but rather induced by the circuit. Given a sentence with a nontrivial LFocus in SS, how is it represented by AS? LFocus and IFocus are similar in an important way. Each breaks up a sentence into two parts: the Focus and the rest. We might suppose, then, that matching up the structures on this basis would be a part of the natural isomorphism between the two levels SS and AS, with the consequence
that, in the normal case, the IFocus and the LFocus would be identified with each other. That is indeed what we find in the ‘‘unmarked’’ pronunciation of cleft sentences.

(21) a. It was JOHN that Bill saw.
     b. *It was John that Bill SAW.

It is also what we find in normal focus, as defined by question-answer pairs.

(22) A: What did you buy in New York?
     B: I bought a RECORD in New York.
     B′: *I bought a record in New YORK.

For the relation between IFocus and LFocus to be completely clear, the full details of AS—and for that matter QS (= TopS)—must be developed, and I will not do that here. I will make the smallest number of assumptions possible. That is, AS generates a set of accent structures, and in particular defines the notion ‘‘Accented Phrase’’ in a way that captures its central property: for English, it appears that the Accented Phrase can be any phrase that contains the main accent on a right branch. The fact that in a right-branching structure a number of different phrases will qualify is the phenomenon of Focus projection.

(23) I [want to [see [the man [in the [red HAT]]]]].

Any of the bracketed expressions in (23) can be the Accented Phrase in AS, hence the IFocus in IS. The definition of Accented Phrase accounts for Focus projection. The IFocus in FS will canonically map to the Accented Phrase in AS. SS likewise defines LFocus in some manner. At the worst, certain constructions, like clefts and sentential answers to questions, are specified as determining an LFocus. In the natural isomorphism between the levels, LFocus = Accented Phrase = IFocus. Correspondingly, the LPresupposition of (21a) (Bill saw someone) and its IPresupposition (Bill saw t) are matched as well. What is odd about (21b), then, is that the IFocus is not identified with the LFocus. An odd sentence, but not a truly ungrammatical one—it simply has a very specialized use. We can use the machinery just developed to explicate that use.
When the natural isomorphism holds between SS and AS, the pairings IFocus = LFocus and IPresupposition = LPresupposition result. The semantics is straightforward: the meaning of the IFocus is some function of
the LFocus, and the meaning of the IPresupposition is some function of the LPresupposition. But when the isomorphism is broken, as in (21b), these identities do not hold. Instead, for (21b) the identities are these:

(24) It was John that Bill SAW.
     SS: LFocus = John
         LPresup = Bill saw someone
     PP: Accented Phrase = SAW
         Rest = it was John that Bill X
     IS: IFocus = SAW
         IPresup = it was John that Bill X

The IPresupposition here includes both the LFocus and (part of) the LPresupposition. It therefore cannot be identified with the LPresupposition—or, for that matter, with any other constituent in SS. Its meaning therefore cannot be (a function of) the meaning of the LPresupposition, or the meaning of any subconstituent in SS. Rather, it must take the whole surface structure (with both LFocus and LPresupposition) as its value, but with the IFocus abstracted out.

(25) [saw]IFocus [[John]LFocus [Bill Xed someone]LPresup]IPresup

(25) shows how AS represents SS, but without the natural isomorphism. It is because of the nonisomorphism that (21b) has such a specialized use. Normally, the LFocus is not IPresupposed. In this example it is; in fact, a particular LFocus:LPresupposition pair is IPresupposed. Under what circumstances would this be appropriate? Only if that LFocus:LPresupposition pair had occurred together in recent previous discourse. But that could really only be the case if something like (26A) preceded (21b).

(26) A: It was JOHN that Bill heard.
     B: No, it was John that Bill SAW.

The narrow circumstance in which this sort of IPresupposition is possible is what gives examples like (21b) their metalinguistic or ‘‘corrective’’ flavor. In ordinary terminology, the focus on saw in (21b) would be called contrastive focus and would be given a separate theoretical treatment, or at least the promise of one.
But in fact, many of the things that are true of focus in general are true of contrastive focus as well, and there is therefore much to lose in not giving them a common account. For example, the rules for determining Focus projection are the same for both contrastive focus and normal focus, as the following examples show.
(27) a. It was John that Bill SAW in the morning.
     b. It was John that Bill saw in the MORNING.
     c. A: What did you do to John?
        B: I SAW him.
     d. A: What happened?
        B: Bill saw John in the MORNING.

In (27a) the contrastive focus is narrow, just as the normal focus is in (27c); likewise, in (27b) the contrastive focus is potentially broad, just as the normal focus is in (27d). Such parallels compel us to treat contrastive and normal focus by the same mechanisms, which include the identification of the IFocus and the relation of IFocus to the Accented Phrase. In addition, when a language has left-accented Focuses, as Hungarian does, the left accenting holds for both normal and contrastive focusing. But of course a difference must be drawn somewhere. In the present scheme it is drawn in the relation of SS to AS, and specifically in the relation of the IFocus to the LFocus—when IFocus represents LFocus, we get normal focusing; when it doesn’t, we get contrastive focusing. An important element in this explanation is that LFocus is subordinate to IFocus. This is shown by the fact that LFocus can wind up in the IPresupposition, but the reverse can never happen, because of how AS and SS relate to one another. In other words, it is not enough to say of (28B) that it has two Focuses. The following exchange will always be impossible:

(28) A: JOHN saw Bill.
     B: *No, it was Sam that JOHN saw.

Here, speaker B has attempted to correct speaker A, but has chosen the wrong focus strategy to do it: he has preserved speaker A’s main Focus as an Accented Phrase and has added his own correction as an LFocus different from the Accented-Phrase-defined Focus. A theory that assigns triggering features to Focuses does not thereby explain this particular asymmetry, even if it assigns different features to the two. RT distinguishes them by virtue of the asymmetric relation between levels and the fact that they are located in different levels.
9.3.2 Copular Inversion and Focus

The apparatus developed here can unravel some of the intricacy of copular constructions. Copular sentences with two NPs show a complex
interaction among IFocus, LFocus, and referentiality. Such sentences usually have inverted and uninverted variants.

(29) a. John is the mayor.
     b. The mayor is John.

From small clause constructions, we know that one of these orders is more basic.

(30) a. I consider John the mayor.
     b. *I consider the mayor John.

I will assume that the ‘‘narrower’’ term (John) is the subject of the sentence in some sense of subject relevant to a level prior to SS or to SS itself, the earliest level in which LFocus and IFocus are defined; I will then refer to the order in (29a) as the subject order (see Williams 1997 for fuller discussion, but in a different theoretical context). Both (29a) and (29b) are grammatical with final accent; however, they diverge if the accent falls on the initial NP.

(31) a. JOHN is the mayor.
     b. *The MAYOR is John.

Like some previous examples, (31b) is not ungrammatical; rather, it is restricted to ‘‘corrective’’ contexts. We may gain some understanding of (31) if we assume that the order in (31a) is the subject order, but the order in (31b) is not. Then the pattern in (31) is just the familiar pattern we have seen for HNPS, and the logic of (31) is, ‘‘Invert to deliver a canonical (final) Focus, but not otherwise.’’ The two orders show a surprising further difference in relatives and questions.

(32) a. I wonder who is the mayor?
     b. I wonder who the mayor is?
     c. I met the person who is the mayor.
     d. *I met the person who the mayor is.
The intriguing contrast is (32b) versus (32d): since both involve wh movement, it seems unlikely that the difference has to do with movement per se. Also, both have noncanonical (nonfinal) IFocuses, so the answer does not lie there either. But two plausible suppositions will suffice to explain the difference in the context of RT. First, suppose the inverted subject must be an LFocus;
256
Chapter 9
and second, suppose that questions, but not relatives, have LFocus ‘‘pivots’’ (wh words). Then (32b) ‘‘compensates’’ for noncanonical subject order by establishing a canonical LFocus; but in (32d) there is no corresponding compensation, so the noncanonical subject order is unmitigated. There is some evidence for both of the suppositions needed in this explanation. First, questions do seem to raise a propositional presupposition of exactly the sort that would be given by identifying the pivot as the LFocus. That is, (33a) seems to presuppose the truth of (33b).

(33) a. Who did you see?
     b. You saw someone.

Second, there is some difference in the presuppositions for inverted and uninverted copular sentences, which I think the following examples bring out:

(34) a. Bill thought that John was the mayor, but in fact the town had no mayor.
     b. ?Bill thought that the mayor was John, but in fact the town had no mayor.

That is, the inverted form seems to carry a presupposition, ‘‘the mayor is somebody,’’ which the uninverted form does not carry. (For further discussion, see Williams 1998a.)

9.3.3 Spanish LFocus

Having made the distinction between IFocus and LFocus, let us return to a problem alluded to in chapter 2. It has often been noted that ‘‘answers to questions’’ in Spanish are obligatorily clause final.

(35) A: Who called?
     B: *JUAN llamó por teléfono.
        JUAN called
        (Zubizarreta 1998, 76)
     B′: Llamó por teléfono JUAN.

Even to discuss the problem, we must distinguish normal focus from contrastive focus, because contrastive focus in Spanish is not subject to the limitation just illustrated. However, distinguishing them risks losing an account of all they have in common, as noted earlier in this chapter: they have anaphoric commitments of the same kind, they both carry nuclear stress internally, and so on. The IFocus/LFocus distinction allows us to treat them separately without abandoning a common account of the phenomena just described.

First, we will need to assume one of the conclusions reached earlier: that questions, and their answers, involve LFocus of the answer, for reasons already given—a question generates an LPresupposition, and the response to the question carries forward the LPresupposition of the question and substitutes the answer for the wh phrase as LFocus.

Now we may begin to approach the question of how Spanish focus works. First, why must the answer to a question, which we have identified now as the LFocus, be clause final in Spanish? We have already assumed that in the canonical SS → AS representation, the LFocus is mapped to the IFocus. Let us further assume that the SS Focus is rightmost. The rightward positioning of the LFocus in SS arises from the requirement that SS match QS; in other words, we will assume that the rightness requirement originates in QS and propagates to SS under Shape Conservation. Rightness of LFocus will be enforced to the extent that SS ≅ QS is enforced. In particular, if SS ≅ QS supersedes SS ≅ PS, then LFocuses will appear in a rightward position, if possible. Let us suppose that Spanish is such a language. Then we do expect the behavior in (35): if the LFocus can be rightmost, then it must be rightmost.

But other predictions are generated as well. First, the LFocus will appear on the right only if the syntax allows it. Since subjects can be postposed in Spanish, rightward positioning of LFocused subjects is possible. But, as we saw in section 2.7, there are situations in which such postposing is impossible.

(36) A:
Con quien llegaron enferma?
        with who arrived sick
        ‘Who_i did he arrive with sick_i?’
     B: Llegaron con MARIA enferma.
     B′: *Llegaron enferma con MARIA.
Example (36) is significant in sorting out theoretical treatments of focusing effects. In RT (36B) is grammatical precisely because (36B′) is not. (36B′) is not grammatical because it is not an available structure in the relevant level of representation (SS in the present context). Therefore, (36B) is the closest match to the quantification structure, and so even though it mismatches on the positioning of the Focus, it is the best match, hence grammatical (though it is judged slightly worse than a ‘‘normal’’
answer in which postposing is possible). So, the best match wins, even when the best match is a bad match. I must stress that I do not have an account for why postposing is not allowed in these cases, only for why nonfinal Focuses are acceptable when postposing is not allowed.

In a Checking Theory account of such structures, (36B) is mysterious. If there is a focus feature that must be checked in Spanish, resulting in obligatory postposing of the subject, why is that feature not left unsatisfied in (36B), making the sentence ungrammatical? That is, the grammaticality of (36B) cannot be understood in terms of the ungrammaticality of (36B′), because an unchecked feature is an unchecked feature.

In RT, English differs from Spanish in two ways. First, English does not allow subjects to be postposed. I assume this is due to a difference in the constitution of SS (or PS). Second, English does not favor FS ≅ SS (or derivatively FS ≅ QS), in that LFocuses are tolerated in nonfinal position, as we saw earlier. I assume these are independent differences between the two languages. If so, then there is room for other language types—specifically, for a language that strongly favors LFocus in rightmost position, but without subject postposing. Such a language would treat subject LFocuses in the same way that English does, that is, in situ; but in the VP, where reordering is possible, not putting the LFocus in final position would be sharply worse than in English. French might be such a language.

The second thing to understand about Spanish is why the rightward positioning requirement is not imposed for contrastive focus. In short, because contrastive focus does not involve an LFocus. The LPresupposition is a presupposition of truth, and, as we saw in section 9.3.2, it is not relevant to the general case of contrastive focus.

(37) I prefer the red book to the [BLUE]_IFocus book.
The same notion of IFocus is applicable to both contrastive and normal focus, but the rightward positioning requirement for answers stems from the syntax of LFocus in SS, not from IFocus, and so has no effect on examples like (37) or like Zubizarreta's (1998, 76) (see (57) in chapter 2).

(38) JUAN llamó por teléfono (no PEDRO).
     JUAN called not PEDRO

Here the Focus is an IFocus and is not extraposed even though extraposition is possible. The focusing in (38) involves no truth presupposition, insofar as saying JUAN called does not presuppose the truth of someone called. It presupposes that x called has occurred in the discourse already; but that is nothing more than to say that x called is an anaphor, not that it is true.

(39) Mary didn't call; but JUAN called.

The anaphor called is licensed by Mary didn't call, even though that clause explicitly denies that Mary called and gives no indication that anyone else did.

9.3.4 Hungarian Focus

As we saw in chapter 2, Hungarian focus structure is Focus initial, in that the Focus precedes all nontopicalized clause elements, including the subject.

(40) Hungarian focus structure
     Topic . . . Topic F [V . . . ]

Hungarian differs in this way from the languages we have considered so far—English, Spanish, Italian. If this is correct, then Hungarian differs parametrically in how it structures one of the levels (FS), which tells us that the levels themselves are not fully fixed universally. We will see that languages can vary in two ways: not only in which representation relations they prefer over others, as in chapter 2, but also in how the levels themselves are structured. RT will then differ from other theories in having a nonuniform source of variation: Checking Theory reduces all variation to strength of features, Antisymmetry reduces all variation to remnant movement, and Optimality Theory reduces all variation to reordering of constraints. For some this might be enough to put RT out of the running, but surely that conclusion is premature.

Hungarian differs from English in another way: the Focus itself must be initially accented. In (41) the Focus can be any of the underlined constituents.

(41) János [a TEGNAPI cikkeket] olvasta.
     Janos the yesterday's articles read
     ‘Janos read yesterday's articles.’
     (Kenesei 1998, as reported in Szendrői 2001)

Hungarian and English thus differ on two parameters: Is neutral Focus position on the left or the right? and Is the focused constituent left
accented or right accented? If it turns out that all languages are of either the Hungarian or the English type, then I will be deeply embarrassed, as I have constructed a theory in which there are four possible language types, including as well, for example, languages where the left-accented Focus occurs on the right periphery, and the reverse.

(42) a. [ . . . [Accent . . . ]]
     b. [[ . . . Accent] . . . ]

I frankly cannot think of a natural scheme to tie these two parameters together as one. In RT in particular it would be difficult to coordinate them, as they govern different levels: the accent placement parameter governs AS, and the left versus right placement of Focus itself is a feature of FS. For these reasons I hope the two parameters do not turn out to be linked empirically. Furthermore, there is a little evidence, from English and symmetrically from Hungarian, suggesting that they are independent. Both English and Hungarian have nonperipheral Focuses, and those Focuses are accented like their peripheral counterparts.

(43)
In Hungarian, noninitial Focuses are allowed only as second Focuses, as single Focuses must move to initial Focus position. These examples show that the internal placement of the accent is independent of whether the Focus is peripheral or not, suggesting that the internal placement is independent of the external distribution and in turn that languages with the parameters set as in (42) are to be expected.

In sum, then, it appears we might say that universally, (a) Focuses are either left or right accented, as a part of the definition of the Accented Phrase in AS; (b) the principal constituent of FS is located either left-peripherally or right-peripherally in the structures defined there; and (c) the AS is mapped to the FS under Shape Conservation.
9.4 Ellipsis in RT
In section 9.3 we had call to identify the complement of an IFocus as an ‘‘anaphoric’’ IPresupposition. In fact, IPresupposition is a poor term, since, as shown there, there is no presupposition in the sense of a proposition with a truth value. Finding anaphora operating in AS, in the form of destressing, suggests revisiting the theme of chapter 4, where it was shown that different reflexive anaphors occupy different RT levels, with predictably different properties. Are there other kinds of anaphors that can be ‘‘indexed’’ according to the RT levels? A good candidate is the family of ellipsis rules. English and other languages display several kinds of ellipsis, with puzzling differences in behavior. I think some of these properties, particularly involving differences in locality, can be explained by locating them at different RT levels.

English, as well as other languages, has an ellipsis rule that deletes everything but a single remnant constituent.

(44) John wants to build Mary a tree house on Friday, and {Sam_nom, too / Sam_acc, too / a coffin, too / on Sunday, too}.

Although (45) is potentially ambiguous, given a particular focus, its interpretation is fairly well fixed.

(45) Bob saw BILL, and Pete too.
     = and Bob saw Pete
     ≠ and Pete saw Bill

This is exactly what we would expect if the construction in question were interpreted in AS. The interpretive layer of AS (FS) partitions a sentence into IFocus and IPresupposition, and it is the IPresupposition, and only the IPresupposition, that is used as the antecedent. For this reason I will refer to this kind of ellipsis as Focus ellipsis. Fixing Focus ellipsis at AS—that is, very late—suggests that it will be highly nonlocal. In particular, it suggests that the ellipsis site itself can span CP boundaries, which does indeed seem possible (elided elements are shown in angle brackets).

(46) Someone thinks that Bill likes fruitcake, and ⟨someone thinks that⟩ Pete ⟨likes fruitcake⟩ too.
VP ellipsis presents quite a different picture. VP ellipsis seems intrinsically bound up with the notion of subjecthood we have associated with PS: the elided material is always interpreted as a predicate that takes the remnant of ellipsis as its subject.

(47) Sue likes oats in the morning and John does too.

Since VP deletion is licensed (first) in PS, we would expect it to be immune to the identification of the Focus, and this seems largely true.

(48) {John saw MARY / JOHN saw Mary} and then BILL did too.

The anaphora is compatible with any choice of Focus. Not only can the main accent be located anywhere; in addition, wherever it is, Focus projection is possible without affecting the interpretation of the ellipsis. Again, this is what would be expected if VP ellipsis were adjudicated in PS, before AS. Moreover, the availability of ‘‘strict’’ versus ‘‘sloppy’’ readings does not turn on focus structure, as the following examples show:

(49) a. JOHN likes his mother, and so does BILL.
     b. John likes his MOTHER, and so does Bill.
     c. i. Bill likes Bill's mother (sloppy)
        ii. Bill likes John's mother (strict)

Both (49a) and (49b) have both readings in (49c), despite having different accent structures. (See Williams 1974 or Fiengo and May 1994 for accounts of the strict and sloppy readings.)

What is invariant about VP deletion is the relation of the ellipsis to what remains. The VP is a predicate on the subject that remains, and it is on the basis of this that the strict/sloppy readings are sorted out—the ambiguous pronoun bears an ambiguous relation to the subject. Focus ellipsis bears the relation IPresupposition to the IFocus that remains undeleted; therefore, in both cases the target of the ellipsis is appropriate to the level at which it takes place. Focus ellipsis also shows strict/sloppy identity ambiguities.

(50) a. Sam told JOHN to buy his mother a present, and PETE as well.
     b. i. Sam told Pete to buy John's mother a present
        ii. Sam told Pete to buy Sam's mother a present

Appropriately, the ambiguity lies in how the pronoun relates to the remnant of the ellipsis, in this case, the Focus; as a result, all else being equal,
Focus ellipsis behaves in a way parallel to VP ellipsis. For both VP ellipsis and Focus ellipsis, we can imagine the sort of account put forward in Williams 1974, wherein the deleted material bears an ‘‘abstraction’’ relation to the remnant material. In the case of VP ellipsis the abstraction is the abstraction inherent in the subject-predicate relation; in the case of Focus we can easily imagine that the same kind of abstraction is involved.

(51) a. John λx (x likes his mother)
     b. John λx (Sam told x to buy his mother a present)

Then in both cases the ambiguity will lie in whether the pronoun takes as its antecedent the lambda variable x (for the sloppy reading) or the argument of the lambda expression, John (for the strict reading). The result so far is that the interpretation of the ellipsis, and in particular the behavior of the strict/sloppy ambiguity, turns on structures needed independently: the articulation into subject and predicate in PS for VP ellipsis, and the articulation into Focus and Presupposition for Focus ellipsis.

However, this pretty picture is marred somewhat by the existence of speakers who accept a wider class of sloppy readings for VP ellipsis. The following sort of case is reported by Fiengo and May (1994):

(52) a. John's father thinks that he will win, and Bill's father does too.
     b. i. Bill's father thinks that John will win (strict)
        ii. Bill's father thinks that Bill will win (sloppy)

Fiengo and May develop a theory of sloppy identity that depends on a general notion of ‘‘parallelism’’ that must hold in ellipsis sites; the sloppy interpretation arises here because the relation between John's and he in the first clause of (52a) is structurally parallel to the relation between Bill's and Bill in (52bii). The sloppy readings for examples like (52) are, I think, only marginally available, and not at all for some speakers. But the mystery remains: where do they come from?
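The abstraction account of the strict/sloppy ambiguity in (51) can be pictured with a small schematic sketch. This is my illustration, not part of the book's formalism, and the helper names (likes, mother) are invented: the elided predicate is a lambda abstract, and the two readings correspond to whether the pronoun is translated as the bound variable or as the constant John.

```python
# Illustrative sketch of (51): the elided predicate is a lambda
# abstract; the pronoun "his" is resolved either to the bound
# variable (sloppy) or to the original argument John (strict).
# All names here are invented for the demonstration.

def likes(x, y):
    return f"{x} likes {y}"

def mother(x):
    return f"{x}'s mother"

# (51a) John lambda-x (x likes his mother)
sloppy = lambda x: likes(x, mother(x))       # "his" = the bound variable x
strict = lambda x: likes(x, mother("John"))  # "his" = John

# Applying either abstract to the remnant subject of the second
# conjunct ("... and so does Bill") yields the two readings:
print(sloppy("Bill"))  # Bill likes Bill's mother  (sloppy)
print(strict("Bill"))  # Bill likes John's mother  (strict)
```

Applied to the first conjunct's own subject, the two abstracts coincide, which is why the ambiguity surfaces only in the ellipsis clause.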
I think the focus structures of the examples shed some light on the situation. Importantly, the success of sloppy ambiguity that turns on antecedents other than subjects depends completely on focus structure, as the following examples show:

(53) a. John's father thinks he will win, and BILL's father does too.
     b. John's father thinks he will win, and Bill's MOTHER does too.
        ≠ Bill's MOTHER thinks Bill will win
     c. *John's father thinks he will win, and BILL's mother does too.
(53b) does not have a sloppy reading, the one indicated beneath it. This is clearly the result of Bill's not being the Focus of the second clause. (53c) simply shows that given the context, BILL could not be the Focus, because of the disanaphora conditions on focusing discussed earlier.

Two points will clarify the situation. First, for some speakers it appears that sloppy identity for VP ellipsis is being licensed in exactly the manner of Focus ellipsis: sloppy identity can turn only on the Focus. That is, the ellipsis is being licensed by a structure that looks like this:

(54) BILL λx (x's mother [thinks he will win])

This is a structure that arises in FS, not PS. So we might conclude that for some speakers the sloppiness can arise in FS, not PS. This will also explain why (53b) does not have a sloppy reading; it does not qualify for one in PS, because the sloppiness does not turn on the subject, and it does not qualify for one in FS, because the sloppiness does not turn on the Focus. So we may account for the phenomenon in (53) by supposing that for some speakers VP ellipsis is licensed in FS, instead of (or actually, in addition to) PS.

The most compelling reason that this picture must be essentially correct is that even for speakers who allow Focus-anteceded sloppy identity for VP ellipsis, focus plays no role when the licensing is subject-anteceded. This can be verified in examples already given; for example, (53a,b), which are repeated here, both have valid sloppy interpretations in which the antecedent for the pronoun is Bill's mother (the reading indicated in (55c)).

(55) a. John's father thinks he will win, and BILL's father does too.
     b. John's father thinks he will win, and BILL's MOTHER does too.
     c. Bill's mother thinks that Bill's mother will win.
What this means is that all speakers have access to the ‘‘core’’ case of VP licensing—the one found in PS, where only subjects antecede elided material, and where variations in focus structure play no role in the availability of antecedents. So focus-based variation arises only when the licensing takes place at FS. Now let us apply this methodology to other ellipsis rules. English has another ellipsis rule called gapping, a stylistically somewhat formal rule. Gapping seems restricted to coordinated IPs; at least, that is what the following paradigm suggests:
(56) a. I think that John saw Mary, and Mary John.
     b. *I think that John saw Mary, and that Mary, John.

This restriction suggests that gapping is defined on the level at which IPs are defined, but not CPs—in other words, on something like PS. If that is so, then gapping should be bounded by CPs not only as shown in (56), but also as shown in (57).

(57) a. John thinks that Sue bought a dog, and Pete, a cat.
     b. John wants to buy a dog, and Pete, a cat.
     c. John wants Sue to buy a dog and Pete, a cat.

(57a) is grammatical, but it cannot mean ‘. . . and Pete thinks that Sue bought a cat’; that is, the ellipsis cannot bridge the tensed complement structure, but must be contained entirely within it. (Elided elements are shown in angle brackets.)

(58) a. *[John thinks that Sue bought a dog] and [Pete ⟨thinks that Sue bought⟩ a cat].
     b. John thinks that [Sue bought a dog] and [Pete ⟨bought⟩ a cat].
     c. John wants Sue to buy a dog and Pete, ⟨wants Sue to buy⟩ a cat.
     d. John wants Sue to buy a dog and Pete ⟨wants⟩ a cat ⟨to buy a dog⟩.

The restriction follows if gapping is restricted to PS, where CP structure has not yet been introduced. Of special interest is (58c), as the embedded clause has a subject, but is not tensed. (58c) is slightly more difficult to parse in the manner indicated. In fact, a different reading interferes, the one indicated in (58d) (see Hankamer 1973 for discussion). But most speakers accept (57b), particularly if the pause is made especially prominent.

If these discriminations are correct, they strongly confirm the framework that predicts them. To summarize the prediction: from a fact about the context in which gapping takes place (56), we infer the discriminations in (57) and (58), discriminations we have no right to expect in the absence of RT.

In all of the discussions of locality so far, I have given cases in which the ellipsis slices into the complement—that is, deletes part of it. But then what about cases in which the ellipsis includes the whole of the complement?
(59) John said [that he was leaving]CP on Monday, and Bill ⟨said [that he was leaving]CP⟩ on Tuesday.
In (59) an entire CP has been gapped along with the verb. But how is that possible, if gapping occurs at a level where CP has not yet been introduced? The answer must be something like this. At the point at which the gapped structure is assigned an antecedent, which I will continue to suppose is IP, the full CP structure has not been introduced in the antecedent VP, but the gapping rule nevertheless establishes the antecedent relation between the two VPs. (The relation is indicated here by coindexation.)

(60) John [said that]VPi on Monday and Bill [e]VPi on Tuesday.

At a later stage—say, SS—the full tensed CP is filled into the complement position in the first clause.

(61) John [said [that he was leaving]CP ]VPi on Monday, and Bill [e]VPi on Tuesday.

In the resulting structure [e]VP will be understood as having the whole VP as its antecedent, including the CP. Under this arrangement the rule licensing the gapped material does not have access to the CP structure; but it does not need to have that access. Therefore, it will still be impossible to delete a proper subpart of a complement CP.

The final ellipsis rule I will consider in connection with the RT levels is sluicing. Sluicing is triggered by the presence of wh phrases, so it is inevitable that it is licensed in SS, the level in which wh is defined.

(62) John likes someone, but I don't know who [John likes t].

Given that sluicing is licensed in a structure in which CP has been introduced, we expect that it can slice into CPs, and this appears to be so.

(63) John thinks that Mary will lie to someone, but I don't know who John thinks [that Mary will lie to t].

The residual preposition guarantees that the embedded clause has been sliced into, and not simply deleted as a whole, which (as we saw in the case of gapping) is irrelevant to evaluating locality.

9.5 The Semantic Values of Elements in RT Levels
An NP in TS corresponds to a pure theta role; an NP in higher levels corresponds more and more to what we think of as a full NP—referential, quantificational, and so on. An NP in CS is a Cased NP; presumably it is here, and possibly in later levels, that expletives enter. We can then talk about the ‘‘history’’ of an NP as the series of objects at different levels that are put in correspondence under the isomorphic mapping that relates the levels to one another.

(64) TS: [dog . . . ] → CS: [dog_nom . . . ] → SS: [[every dog] . . . ]

The sequence dog, dog_nom, every dog is established by Shape Conservation. Since presumably every NP has an image in every level, it might at first seem difficult to distinguish the different levels. But in fact I think that anaphors, as described in chapter 4, can give us some insight into the differences between the levels. Recall that anaphoric bindings are a part of what Shape Conservation carries forward from one level to the next, so that a coindexation (or its equivalent) established in an early level will persist in later levels.

(65) TS: [dog_i likes himself_i] → CS: [dog_nom,i likes himself_i] → SS: [[every dog]_i likes himself_i]

If the anaphor is assigned its antecedent in CS (for concreteness), then that assignment is carried forward to SS by the Shape Conservation mapping. Put in terms used in earlier chapters, the ‘‘antecedent’’ relation commutes with the representation relation, in that, given an anaphor, the image (under shape-conserving mapping) of the anaphor takes as its antecedent the image of the antecedent of the anaphor. But an anaphoric relation established in an early level may ‘‘mean’’ something different from an anaphoric relation established in later levels; at least, that is what I will tentatively suggest in what follows. For example, an anaphoric binding in TS binds two theta roles together—two coarguments, or, as suggested in chapter 4, perhaps a somewhat broader notion. One cannot coherently say that the two theta roles ‘‘corefer,’’ since reference, in the sense of that property which, for example, definite NPs have, is not a concept at that level.
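The way a coindexation established early persists across levels, as in (65), can be pictured with a small schematic sketch. This is my illustration, not the book's formalism, and the level contents are invented: each level supplies its own object for each structural position, the antecedent relation is stated over positions, and so the relation automatically ‘‘commutes’’ with the level-to-level mapping.

```python
# Schematic sketch of (65): three levels, each with its own objects
# in the same structural positions; an antecedent link established
# over positions is carried forward to every level by the
# shape-conserving (position-preserving) mapping.
# The level contents are illustrative only.

ts = {"subj": "dog",       "obj": "himself"}   # theta structure
cs = {"subj": "dog-NOM",   "obj": "himself"}   # Case structure
ss = {"subj": "every dog", "obj": "himself"}   # surface structure

levels = [("TS", ts), ("CS", cs), ("SS", ss)]

# Antecedent relation, stated once over positions: obj -> subj.
antecedent = {"obj": "subj"}

# At every level, the image of the anaphor takes the image of its
# antecedent: the relation commutes with the representation relation.
links = []
for name, level in levels:
    for ana_pos, ante_pos in antecedent.items():
        links.append((name, level[ana_pos], level[ante_pos]))

for name, ana, ante in links:
    print(f"{name}: {ana} takes {ante} as antecedent")
```

The point of stating the link over positions rather than objects is that nothing extra needs to be said at the later levels: the antecedent of himself at SS comes out as every dog for free.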
Theta roles in TS are the actors, patients, and so on, that are the arguments of predicates, and coindexing two theta roles says that they are ‘‘the same’’—that is, ‘‘identified.’’ This will translate into coreference in a later level—specifically, in whatever later level the relevant notion of reference is operative. This at least tells us that split antecedents are impossible at this level, as splitting an anaphor implies some kind of substructure, and theta roles themselves are indivisible at TS, that is, atomic. Later coindexings might be liable to split antecedents, as at least the full notion of reference will have to allow the sorts of relations that have been referred to as coreference, overlap in reference, subsumption of reference, disjointness of reference, and so on, and therefore will clearly allow the sort of structure that would support split antecedence.

By this thinking, then, we arrive at the notion that early anaphors will not allow split antecedence, but late anaphors will. This will be more than a way to simply classify anaphors, as we now know some things about the behavior of early and late anaphors: early anaphors will display sharp locality restrictions, will have a limited set of admissible antecedents (in the A/A sense), and will always be transparently reconstructed for by movement and scrambling relations. If it turns out that these things also correlate with the possibility of having split antecedents, then that becomes a strong cross brace in the empirical underpinning of RT.

I have not carried out the broad empirical survey that would deliver a sound decision on this speculation. It would be relevant to know, for example, whether long-distance uses of Japanese zibun allow split antecedents. But there is one suggestive indication that the correlations are exactly as expected. It is well known that English clausemate, coargument antecedents are not allowed to be split, and as I have already suggested, these are CS or possibly TS anaphors, on the grounds of locality and reconstructivity.

(66) *John_i told Mary_j about themselves_{i,j}.

This fact is certainly consonant with my proposals; indeed, if it were false, it would call into serious question the premise on which I am basing the further predictions in this section.
At the other end of the scale are anaphors of the kind discussed by Reinhart and Reuland (1993); as determined in chapter 4, these are defined at a late stage in the model, on grounds of their lack of locality.

(67) John told Mary that at least Bill and himself would be there.

The question then is, can these anaphors be split? The following example is relevant:

(68) John_i told Mary_j that at least Bill and themselves_{i,j} would be invited to the party.
If the judgment discriminating (66) and (68) is reliable, these examples are encouraging, because in the absence of RT, there is no particular reason that locality and target type should correlate with the possibility of split antecedents.

If these two types of anaphors differ in this way, then we would expect them to differ in reconstructivity as well: anaphors that do not allow split antecedents would reconstruct, and anaphors that do allow split antecedents would not. Although the following examples are the right kinds of examples to make the point, I think they are complex enough that firm judgments are not available; consequently, although the marks in (69) do correspond to my own judgments, perhaps they should be read as only the ‘‘predicted’’ judgments.

(69) a. What John_i saw t was himself_i dancing in the street.
     b. *What John_i told Mary_j that he saw on TV was Bill and themselves_{i,j} dancing in the streets.

Of course, in order for (69b) to be relevant at all, it must be determined that reconstruction is necessary in the first place; if the surface, unreconstructed configuration of the anaphor and its putative antecedents is valid, then (69b) would be irrelevant to the question of reconstruction. But I think the following example establishes that something like c-command is necessary even for these sorts of reflexives:

(70) *Exactly when John told Mary to leave, I saw Bill and themselves dancing in the streets on TV.

Controllable (null) subjects are like anaphors in dividing into two sorts, one allowing splitting, and the other not; the former are traditionally called obligatory control cases, and the latter, non–obligatory control cases. Obligatory control cases take determinate local antecedents; non–obligatory control cases take ‘‘arbitrary’’ and ‘‘inferred’’ antecedents. As suggested in chapter 3, it is very likely that these two sorts of control correspond to different ‘‘sizes’’ of infinitives.
Wurmbrand (1998) has documented that this is the case in German. Applying the same reasoning used earlier, the RT expectation is that the "smaller" the infinitive, the earlier the control relation is established, and the less possibility there will be for split antecedence. Again, some very clear cases suggest that this is so. As discussed in chapter 3, no infinitive that clearly takes CP structure shows the properties of obligatory control; likewise, such infinitives show split antecedents.
(71) a. Non-obligatory control
        John_i told Mary_j [how PRO_{i,j} to save themselves]_CP.
     b. Obligatory control
        *John_i promised Mary_j [PRO_{i,j} to save themselves]_CP.

(I have included a reflexive in both cases to guarantee the relevant construals for the examples. The reflexive itself cannot be the locus of the splitting or nonsplitting of antecedents, as it occupies (the whole of) an argument position and cannot be split; but such a reflexive can take as its unsplit antecedent another NP that itself has split antecedents.)

Not all non-obligatory control cases show overt CP structure, but at least the ones that do behave exactly as expected, uniformly allowing split antecedents. Conversely, obligatory control structures do not allow split antecedents. The anaphoric systems in other languages should reveal the same pattern: long-distance anaphors should allow split antecedents, while anaphors with high locality should have unsplit antecedents. In Japanese, for example, we might expect zibun and zibunzisin to differ in exactly this way. However, in checking the literature I have not found examples that unambiguously demonstrate this, independently of the splitting involved in infinitival control.

If I am putting RT to correct use here, in trying to rationalize the "split antecedents" divide among anaphoric elements, then in fact that divide must be the tip of the iceberg, as every pair of RT levels has the potential to give rise to other, but related, kinds of distinctions. This will require sorting out the RT levels more precisely than I have been able to do here.

An additional distinction is perhaps isolated in the following pair:

(72) a. John wants to win.
     b. John wants himself to win.

First, I think these do not differ at all regarding the possibility of split antecedents; in both cases the antecedent of the embedded subject is simply John. But another distinction has often been noted: namely, that (72a) has the de se reading and (72b) does not.
Partee (1971) caught one aspect of this distinction in the contrast between the following pair, which differ sharply in their meanings:

(73) a. Only John wants to win.
     b. Only John wants himself to win.
In standard theory this might be attributed to a difference between PRO and himself. RT at least offers the opportunity to interpret the differences in another way. Significantly, the structure that gives rise to the de se reading is "smaller," and therefore earlier, than the one that does not.

In a related vein, RT levels can also be used to distinguish various kinds of quantifier scope assignment. The first clue is to understand how quantifier scope relates to various opportunities for reconstruction. We know, for example, that wh movement reconstructs for quantifier interpretation in some instances, and not in others, and in fact that NP movement itself reconstructs for certain quantifiers. Wh reconstruction for scope takes place in examples like this:

(74) How many people does John think Bill saw t?

This example is actually ambiguous between de dicto and de re interpretations, which can be schematized as follows:

(75) a. John thinks [x many people [Bill saw t]]  What is x?
     b. [x many people] [John thinks [Bill saw t]]  What is x?

(75a) represents the de dicto interpretation, which plausibly involves a quantifier having scope in the lower clause; (75b) represents the wide scope de re interpretation. (75a) certainly suggests that, in RT, wh movement can occur later than the construal of quantifiers like that many, by the theory's general methodology. In the working model I have adopted for this book, that is not strictly speaking possible, but of course we might take SS to be an abbreviation of some number of levels in which this can be sorted out. Does (75b) suggest that quantifier construal occurs after wh movement as well? Quite possibly, I would guess, though not necessarily, as an embedded quantifier could have wide scope without the benefit of wh movement.

When we turn to NP movement, we again find evidence for reconstruction—what have been called quantifier lowering cases with raising verbs.

(76) Someone seems to have been here.
     a. for someone x, x seems to have been here
     b. seems [for someone x, x to have been here]

In RT there will be no lowering; instead, there will be ordering. The construal of the quantifier someone precedes NP movement; since NP movement is associated with the level PS, quantifier construal must precede
that level. The conclusion that presents itself from the data examined thus far is that quantifiers can be construed in any level; but in fact, quantifiers differ regarding where they are construed.

(77) Not many boys are believed [t to have left].
     a. not many boys [believed [t to have left]]
     b. believed [not many boys [t to have left]]

Most speakers reject the narrow scope reading (77b). So, NP movement seems to reconstruct for construal of someone, but not for construal of not many. In RT this simply means that the levels (or range of levels) at which these two quantifiers are construed are different: one before, one after PS. In this regard RT mimics the findings of Beghelli and Stowell (1997) under the "later equals higher" equivalence discussed in chapter 2.

That the existential is construed early is consistent with the fact that the implicit quantification of suppressed arguments is interpreted as existential, and with extremely narrow scope:

(78) a. They weren't attacked.
     b. They weren't attacked by someone.
     c. not [∃x [x attacked them]]
     d. ∃x [not [x attacked them]]

(78a) can only have meaning (78c), whereas (78b) can have meanings (78c) and (78d). Perhaps the existential binding of implicit arguments is accomplished at TS, thus explaining its generally narrow scope—anything else will come later.

Splitting up quantifier construals between levels raises some technical questions, to which I can at this point only stipulate arbitrary answers, but I suppose I should do at least that if only to show that the project is not incoherent. The general idea is this. As in earlier sections of this chapter, we have seen that the interpretation of structures "accumulates" across levels. Just as with anaphoric bindings, then, scope assignments that are established at earlier levels are preserved in later structure under Shape Conservation.

Many questions remain unanswered. For example, why are some quantifiers excluded from early construal, and presumably, some excluded from late construal? I have no specific ideas about this, though I would of course note that it is a problem for the standard model as well. It is particularly troublesome for Beghelli and Stowell's (1997) model, where quantifiers are assigned scope by moving them to preestablished,
dedicated positions in functional structure. The question is, why are those positions located where they are in functional structure?—essentially the same question that arises under the already mentioned "higher equals later" equivalence between the two styles of modeling the relation between syntax and semantics. But even with so much in darkness, I am encouraged to try to extend the LRT correlations of earlier chapters to questions of scope and antecedence, so as to lock together an even more disparate array of properties of syntactic relationships in a way I think is impossible in other models.
References
Abney, S. 1987. The English noun phrase in its sentential aspect. Doctoral dissertation, MIT.

Anderson, S. 1982. Where's morphology? Linguistic Inquiry 13, 571–612.

Anderson, S. 1992. A-morphous morphology. Cambridge: Cambridge University Press.

Andrews, A. 1982. The representation of Case in Modern Icelandic. In J. Bresnan, ed., The mental representation of grammatical relations, 427–503. Cambridge, Mass.: MIT Press.

Babby, L. 1998a. Subject control in direct predication: Evidence from Russian. In Ž. Bošković, S. Franks, and W. Snyder, eds., Formal Approaches to Slavic Linguistics 1997: The Connecticut Meeting, 17–37. Ann Arbor: Michigan Slavic Publications.

Babby, L. 1998b. Voice and diathesis in Slavic. Ms., Princeton University.

Bach, E. 1976. An extension of classical transformational grammar. In Problems in linguistic metatheory: Proceedings of the 1976 conference at Michigan State University, 183–224. East Lansing: Michigan State University, Department of Linguistics.

Baker, M. 1985. The Mirror Principle and morphosyntactic explanation. Linguistic Inquiry 16, 373–415.

Baker, M. 1996. The polysynthesis parameter. Oxford: Oxford University Press.

Barrett-Keach, C. N. 1986. Word-internal evidence from Swahili for Aux/Infl. Linguistic Inquiry 17, 559–564.

Bayer, J., and J. Kornfilt. 1994. Against scrambling as an instance of Move-alpha. In N. Corver and H. van Riemsdijk, eds., Studies on scrambling, 17–60. Berlin: Mouton de Gruyter.

Beghelli, P., and T. Stowell. 1997. Distributivity and negation. In A. Szabolcsi, ed., Ways of scope taking, 71–107. Dordrecht: Kluwer.

Benedicto, E. 1991. Latin long-distance anaphora. In J. Koster and E. Reuland, eds., Long-distance anaphora, 171–184. Cambridge: Cambridge University Press.
Besten, H. den. 1976. Surface lexicalization and trace theory. In H. van Riemsdijk, ed., Green ideas blown up: Papers from the Amsterdam Colloquium on Trace Theory. Publications of the Linguistics Department 13. Amsterdam: University of Amsterdam, Linguistics Department.

Bodomo, A. B. 1998. Serial verbs as complex predicates in Dagaare and Akan. In I. Maddieson and T. J. Hinnebusch, eds., Language history and linguistic description in Africa. Vol. 2, Trends in African linguistics, 195–204. Trenton, N.J.: Africa World Press.

Bok-Bennema, R. 1995. Case and agreement in Inuit. Berlin: Mouton de Gruyter.

Bošković, Ž. 1995. On certain violations of the Superiority Condition, AgrO, and economy of derivation. Ms., University of Connecticut.

Bošković, Ž. 1999. On multiple feature checking. In S. D. Epstein and N. Hornstein, eds., Working minimalism, 159–187. Cambridge, Mass.: MIT Press.

Brody, M. 1997. Mirror theory. Ms., University College London.

Brody, M., and A. Szabolcsi. 2000. Overt scope: A case study in Hungarian. Ms., University College London and New York University.

Burzio, L. 1996. The role of the antecedent in anaphoric relations. In R. Freidin, ed., Current issues in comparative grammar, 1–45. Dordrecht: Kluwer.

Chierchia, G. 1992. Functional wh and weak crossover. In D. Bates, ed., Proceedings of the 10th West Coast Conference on Formal Linguistics, 75–90. Stanford, Calif.: CSLI Publications.

Chomsky, N. 1957. Syntactic structures. The Hague: Mouton.

Chomsky, N. 1973. Conditions on transformations. In S. Anderson and P. Kiparsky, eds., A festschrift for Morris Halle, 232–286. New York: Holt, Rinehart and Winston.

Chomsky, N. 1986. Barriers. Cambridge, Mass.: MIT Press.

Chomsky, N. 1993. A minimalist program for linguistic theory. In K. Hale and S. J. Keyser, eds., The view from Building 20: Essays in linguistics in honor of Sylvain Bromberger, 1–52. Cambridge, Mass.: MIT Press.

Chomsky, N. 1995. The Minimalist Program. Cambridge, Mass.: MIT Press.
Cinque, G. 1998. Adverbs and functional heads. Oxford: Oxford University Press.

Cinque, G. 2001. "Restructuring" and functional structure. Ms., University of Venice.

Collins, C. 1996. Local economy. Cambridge, Mass.: MIT Press.

Collins, C. 2001. The internal structure of verbs in Ju|'hoan and ǂHoan. In A. Bell and P. Washburn, eds., Cornell working papers in linguistics 18. Ithaca, N.Y.: Cornell University, CLC Publications.

Culicover, P., and W. Wilkins. 1984. Locality in linguistic theory. New York: Academic Press.

Déprez, V. 1989. On the typology of syntactic positions and the nature of chains. Doctoral dissertation, MIT.
Diesing, M. 1992. Indefinites. Cambridge, Mass.: MIT Press.

Di Sciullo, A.-M., and E. Williams. 1987. On the definition of word. Cambridge, Mass.: MIT Press.

Fiengo, R., and R. May. 1994. Indices and identity. Cambridge, Mass.: MIT Press.

Fodor, J. 1978. Parsing strategies and constraints on transformations. Linguistic Inquiry 9, 427–474.

Fox, D. 1995. Economy and scope. Natural Language Semantics 3, 283–341.

Gill, K.-H. 2001. The long-distance anaphora conspiracy: The case of Korean. Ms., University of Edinburgh.

Grimshaw, J. 1978. English wh-constructions and the theory of grammar. Doctoral dissertation, University of Massachusetts, Amherst.

Haegeman, L., and H. van Riemsdijk. 1986. Verb projection raising, scope, and the typology of rules affecting verbs. Linguistic Inquiry 17, 417–466.

Hankamer, J. 1973. Unacceptable ambiguity. Linguistic Inquiry 4, 17–68.

Harley, H. 1995. Subjects, events, and licensing. Doctoral dissertation, MIT.

Hoji, H. 1985. Logical Form constraints and configurational structures in Japanese. Doctoral dissertation, University of Washington.

Hoji, H. 1986. Scope interpretation in Japanese and its theoretical implications. In M. Dalrymple, J. Goldberg, K. Hanson, M. Inman, C. Piñon, and S. Wechsler, eds., Proceedings of the 5th West Coast Conference on Formal Linguistics, 87–101. Stanford, Calif.: CSLI Publications.

Holmberg, A. 1985. Word order and syntactic features. Doctoral dissertation, University of Stockholm.

Huang, C.-T. J. 1982. Logical relations in Chinese and the theory of grammar. Doctoral dissertation, MIT.

Kaplan, R., and J. Bresnan. 1982. Lexical-Functional Grammar: A formal system for grammatical representation. In J. Bresnan, ed., The mental representation of grammatical relations, 173–281. Cambridge, Mass.: MIT Press.

Kayne, R. 1975. French syntax. Cambridge, Mass.: MIT Press.

Kayne, R. 1981. Two notes on the NIC. In A. Belletti, L. Brandi, and L. Rizzi, eds., Theory of markedness in generative grammar, 317–346. Pisa: Scuola Normale Superiore.

Kayne, R. 1994. The antisymmetry of syntax. Cambridge, Mass.: MIT Press.

Kenesei, I. 1994. The syntax of focus. Ms., University of Szeged.

Kenesei, I. 1998. Adjuncts and arguments in VP-focus. Acta Linguistica Hungarica 45/1–2, 61–88.

É. Kiss, K. 1987. Configurationality in Hungarian. Dordrecht: Reidel.

É. Kiss, K. 1995. NP movement, operator movement, and scrambling in Hungarian. In K. É. Kiss, ed., Discourse configurational languages, 207–243. Oxford: Oxford University Press.
Konapasky, A. 2002. A syntacto-morphological analysis of dependent heads in Slavic. Doctoral dissertation, Princeton University.

Koopman, H., and A. Szabolcsi. 2000. Verbal complexes. Cambridge, Mass.: MIT Press.

Koster, J. 1985. Reflexives in Dutch. In J. Guéron, H.-G. Obenauer, and J.-Y. Pollock, eds., Grammatical representations, 141–167. Dordrecht: Foris.

Kuno, S., and J. Robinson. 1972. Multiple wh-questions. Linguistic Inquiry 3, 463–488.

Kuroda, S.-Y. 1970. Remarks on the notion of subject with reference to words like "also," "even," or "only." Annual Bulletin, vol. 3, 111–129; vol. 4, 127–152. Tokyo: Research Institute of Logopedics and Phoniatrics.

Lakoff, G. 1972. On Generative Semantics. In D. Steinberg and L. Jakobovits, eds., Semantics, 232–296. Cambridge: Cambridge University Press.

Landau, I. 1999. Elements of control. Doctoral dissertation, MIT.

Lasnik, H. 1999. Minimalist analysis. Oxford: Blackwell.

Lavine, J. 1997. Null expletives and the EPP in Slavic. Ms., Princeton University.

Lavine, J. 2000. Topics in the syntax of non-agreeing predicates in Slavic. Doctoral dissertation, Princeton University.

Mahajan, A. 1989. The A/A′ distinction and movement theory. Doctoral dissertation, MIT.

Marantz, A. 1984. Grammatical relations. Cambridge, Mass.: MIT Press.

Matthei, E. 1979. The acquisition of prenominal modifier sequences: Stalking the second green ball. Doctoral dissertation, University of Massachusetts, Amherst.

Moltmann, F. 1990. Scrambling in German and the specificity effect. Ms., MIT.

Moortgat, M. 1988. Categorial investigations. Doctoral dissertation, University of Amsterdam.

Müller, G. 1995. A-bar syntax: A study in movement types. Berlin: Mouton de Gruyter.

Neeleman, A. 1994. Complex predicates. Doctoral dissertation, Utrecht University.

Noyer, R. 1992. Features, positions, and affixes in autonomous morphological structure. Doctoral dissertation, MIT.

Partee, B. 1971. On the requirement that transformations preserve meaning. In C. Fillmore and D. T. Langendoen, eds., Studies in linguistic semantics, 1–21. New York: Holt, Rinehart and Winston.

Pesetsky, D. 1987. Wh-in-situ: Movement and unselective binding. In E. Reuland and A. ter Meulen, eds., The representation of (in)definiteness, 98–129. Cambridge, Mass.: MIT Press.

Pica, P. 1991. On the interaction between antecedent-government and binding: The case of long-distance reflexivization. In J. Koster and E. Reuland, eds., Long-distance anaphora, 119–135. Cambridge: Cambridge University Press.
Pinker, S. 1984. Language learnability and language development. Cambridge, Mass.: Harvard University Press.

Pollock, J.-Y. 1989. Verb movement, Universal Grammar, and the structure of IP. Linguistic Inquiry 20, 365–424.

Postal, P. 1974. On raising: One rule of English grammar and its theoretical implications. Cambridge, Mass.: MIT Press.

Prinzhorn, M. 1998. Prosodic and syntactic structure. Ms., University of Vienna.

Reinhart, T., and E. Reuland. 1993. Reflexivity. Linguistic Inquiry 24, 657–720.

Richards, N. 1997. What moves where when in which language? Doctoral dissertation, MIT.

Riemsdijk, H. van. 1996. Adverbia en bepaaldheid. Ms., University of Tilburg.

Riemsdijk, H. van, and E. Williams. 1981. NP Structure. The Linguistic Review 1, 171–217.

Rivero, M.-L. 1991. Long head movement and negation: Serbo-Croatian vs. Slovak and Czech. The Linguistic Review 8, 319–351.

Rizzi, L. 1982. Violations of the Wh-Island Constraint and the Subjacency Condition. In Issues in Italian syntax, 49–76. Dordrecht: Kluwer.

Rizzi, L. 1990. Relativized Minimality. Cambridge, Mass.: MIT Press.

Roeper, T., and M. Siegel. 1978. A lexical transformation for verbal compounds. Linguistic Inquiry 9, 199–260.

Ross, J. R. 1970. On declarative sentences. In R. A. Jacobs and P. S. Rosenbaum, eds., Readings in English transformational grammar, 222–272. Waltham, Mass.: Ginn.

Rudin, C. 1988. On multiple questions and multiple wh-fronting. Natural Language and Linguistic Theory 6, 445–501.

Saito, M. 1991. Long distance scrambling in Japanese. Ms., University of Connecticut, Storrs.

Saito, M. 1992. Long distance scrambling in Japanese. Journal of East Asian Linguistics 1, 69–118.

Saito, M. 1994. Improper adjunction. In M. Koizumi and H. Ura, eds., Formal Approaches to Japanese Linguistics 1, 263–293. MIT Working Papers in Linguistics 24. Cambridge, Mass.: MIT, Department of Linguistics and Philosophy, MITWPL.

Samek-Lodovici, V. 1996. Constraints on subjects: An optimality-theoretic analysis. Doctoral dissertation, Rutgers University.

Santorini, B. 1990. Long distance scrambling and anaphora binding. Ms., University of Pennsylvania.

Selkirk, E. 1982. The syntax of words. Cambridge, Mass.: MIT Press.

Steedman, M. 1996. Surface structure and interpretation. Cambridge, Mass.: MIT Press.
Szabolcsi, A. 1996. Verb and particle movement in Hungarian. Ms., UCLA.

Szendrői, K. 2001. Focus and the syntax-phonology interface. Doctoral dissertation, University of Southern California.

Timberlake, A. 1979. Reflexivization and the cycle in Russian. Linguistic Inquiry 10, 109–141.

Travis, L. 1984. Parameters and effects of word order variation. Doctoral dissertation, MIT.

Ueyama, A. 1998. Two types of dependency. Doctoral dissertation, University of Southern California.

Vanden Wyngaerd, G. 1989. Object shift as an A-movement rule. In P. Branigan, J. Gaulding, M. Kubo, and K. Murasugi, eds., Student Conference in Linguistics 1989, 256–271. MIT Working Papers in Linguistics 11. Cambridge, Mass.: MIT, Department of Linguistics and Philosophy, MITWPL.

Webelhuth, G. 1989. Syntactic saturation phenomena and the modern Germanic languages. Doctoral dissertation, University of Massachusetts, Amherst.

Wilder, C. 1997. Some properties of ellipsis in coordination. In A. Alexiadou and T. H. Hall, eds., Studies on Universal Grammar and typological variation, 59–107. Amsterdam: John Benjamins.

Williams, E. 1971a. Small clauses in English. Ms., MIT.

Williams, E. 1971b. Underlying tone in Margi and Igbo. Ms., MIT. [Published 1976, Linguistic Inquiry 7, 463–484.]

Williams, E. 1974. Rule ordering in syntax. Doctoral dissertation, MIT.

Williams, E. 1977. Discourse and Logical Form. Linguistic Inquiry 8, 101–139.

Williams, E. 1980. Predication. Linguistic Inquiry 11, 203–238.

Williams, E. 1981a. Argument structure and morphology. The Linguistic Review 1, 81–114.

Williams, E. 1981b. Language acquisition, markedness, and phrase structure. In S. Tavakolian, ed., Language acquisition and linguistic theory, 8–34. Cambridge, Mass.: MIT Press.

Williams, E. 1981c. On the notions "lexically related" and "head of a word." Linguistic Inquiry 12, 245–274.

Williams, E. 1986. A reassignment of the functions of LF. Linguistic Inquiry 17, 265–299.

Williams, E. 1987. Implicit arguments, the binding theory, and control. Natural Language and Linguistic Theory 5, 151–180.

Williams, E. 1991. "Why crossover?" Handout, colloquium presentation, MIT.

Williams, E. 1994a. Negation in English and French. In D. Lightfoot, ed., Verb movement, 189–206. Cambridge, Mass.: MIT Press.

Williams, E. 1994b. Thematic structure in syntax. Cambridge, Mass.: MIT Press.

Williams, E. 1997. Blocking and anaphora. Linguistic Inquiry 28, 577–628.
Williams, E. 1998a. The asymmetry of predication. In R. Blight, ed., Texas Linguistic Forum 38, 323–333. Austin: University of Texas, Texas Linguistic Forum.

Williams, E. 1998b. Economy as shape conservation. In Celebration: An electronic festschrift in honor of Noam Chomsky's 70th birthday. http://addendum.mit.edu/celebration.

Williams, E. In preparation. The structure of clusters. Ms., Rutgers University. [To be presented at NIAS/Collegium Budapest Cluster Study Group.]

Wiltschko, M. 1997. D-linking, scrambling and superiority in German. Groninger Arbeiten zur germanistischen Linguistik 41, 107–142.

Wurmbrand, S. 1998. Infinitives. Doctoral dissertation, MIT.

Yatsushiro, K. 1996. On the unaccusative construction in nominative Case licensing. Ms., University of Connecticut, Storrs.

Yip, M., J. Maling, and R. Jackendoff. 1987. Case in tiers. Language 63, 217–250.

Zubizarreta, M. L. 1998. Prosody, focus, and word order. Cambridge, Mass.: MIT Press.

Zwart, C. J.-W. 1997. Morphosyntax of verb movement: A minimalist approach to the syntax of Dutch. Dordrecht: Kluwer.
Index
A/A′ distinction, 72, 118–121, 171
  relativization of, 96, 121, 130–133
Ablative absolute, 192–193
Accent Structure (AS), 243, 251–261
Adjective order, 153–154
Adjuncts and X-bar, 61–62
Adverb positioning, 44
Anaphora, 95–116
Antisymmetry, 19–21
Arabic inflection, 217–218
Assume Lowest Energy State, 163
Benedicto, E., 98–99
Binding, 120
Blocking in semantics, 10, 246–249
Bracketing paradoxes, 5–8
Bridge verbs, 69
Bulgarian, 145–146, 154–157, 168
Burzio, L., 112–113
Case structure, 13
Case-preposition duality, 188–194
CAT, 203–238
Causativization, 66–67
Checking Theory, 29, 35–36
Cinque, G.
  restructuring verbs, 90–91
  functional structure, 201–202
Complement-of relation, 179
Complementizer agreement, 196
Compositionality, 240–246
Contraction, 163–164
Control, 85–86, 269–270
  obligatory/optional distinction, 87–88
Copular inversion, 254–256
Countercyclic derivation, 70–71
CS embedding, 67–69
Czech verb clusters, 237–238
D-linking, 41, 144, 148–149
Disanaphora Principle, 250
Dutch reflexive, 101–103
Dutch verb clusters, 224–229
ECM as CS embedding, 15, 67–68, 105
Ellipsis, 261–266
Embedding, 25
  functional vs. complement, 59, 174–176, 199–201
English auxiliary system, 222–224
EPP subjects, 83–85
Equidistance, 16
Ergative case, 110–111
Excorporation, 187–188
Expletives, 68, 92
Extension, 73, 114
Flip, 206–211
Focus, 34
  IFocus vs. LFocus, 249–261
  normal and contrastive, 32, 249
Focus ellipsis, 261
Focus Structure (FS), 30–33
FS embedding, 60–70
Functional structure, 173
Gapping, 193–194, 264–266
General Ban on Improper Movement (GBOIM), 72
General Condition on Scope, 22
Georgian inflection, 217–218
German
  restructuring verbs, 89–91
  scrambling, 39–44, 119, 122–124, 126–129
  V2 vs. V-final, 78–79
  WCO, 143–145
Haegeman, L., and H. van Riemsdijk (1986), 224–229
Head-complement relation, 11–12
Head Movement Constraint, 171
Heavy NP Shift, 33–38
Holmberg's generalization, 17–19
Hungarian
  focus, 36, 259–260
  scope, 45–50
  scrambling, 160–165
  verb clusters, 229–237
IFocus, 249–261
Improper movement, 71–75
Induced representation, 244
IPresupposition, 252–253
Japanese
  long vs. short scrambling, 157–161
  reflexive, 97
  scope and scrambling, 124–126
Konopasky, A., 148–152
Koopman, H., and A. Szabolcsi (2000), 229–237
L-tous, 79–80
Latin reflexive, 98–99
Level Blocking Principle, 95, 102
Level Embedding Conjecture, 63–65
Lexical-Functional Grammar, 22, 38
Lexical Variation Hypothesis (LVH), 212, 219
Lexicalism, 172–173, 202
LFocus, 249–261
Locality, 164–167
Long topicalization, 130–132
LPresupposition, 252–253
LRT correlations, 59, 117–135
Mirror Principle, 15, 178, 199–203
Mohawk inflection, 219–220
Movement, 26, 62
Multiple exponence, 194–196, 216
Multiple-WH movement, 145–147
Navajo inflection, 221–222
Nominative case, 109
NP structure model, 118, 127
Optimality theory, 22, 38
Pāṇini's principle, 7, 10
Parallel movement, 139–141
Predicate Structure (PS), 86–87, 106–112
QS, 30–33
Quantifier interpretation and reconstruction, 42–43, 271–273
Quantifier scope, 42
Quirky case, 81–83, 110–112
Reassociation, 188–194, 206–211
Reconstruction, 117–135
  and quantifier interpretation, 271–273
Reinhart, T., and E. Reuland (1993), 99–101, 104–106, 108
Relativized Minimality, 185–188
Remnant movement, 19–21, 133–135
Representation, 13–14
  asymmetry of, 60
  as homomorphism, 61
  model for, 23–24
Richards, N., 140, 149–150, 158–159, 166
Right node raising (RNR), 37
Rule of Combination (RC), 204–205
Russian subject position, 52–55, 83–85
Scrambling, 39–44, 117–135, 157–165
  long vs. short, 157–161
  masked, 161
Selection, 183
Self-, 100
Semantic compositionality, 242–246
Semantic interpretation, 25
Serbo-Croatian, 151–152
  verb clusters, 187–188, 237–238
Serial verbs, 65–66
Shadow, 175–177
Shape Conservation, 5, 7–8, 15–23, 239–242, 246
Small clause theory, 63
Southern Tiwa inflection, 219–220
Spanish focus, 50–52, 256–259
Spanning vocabulary, 214–215
Split antecedents, 267–269
SS embedding, 69–70
Subcategorization, 203–204
Subject auxiliary inversion, 191–192
Subjects
  EPP, 83–85
  quirky, 81–83
  and scrambling, 126–129
Superiority, 140–145, 158–159
Superraising, 77–78
Swahili
  inflection, 220–221
  object agreement, 248–249
Swiss German verb clusters, 224–229
Synonymy Principle, 247
Synthetic compounds, 9–13
Target, 95–96
Theta Structure, 13
Topic, 30–32
Tough movement, 75–77
TS embedding, 65–67
Ueyama, A., 124–126
V2, 78–79, 191–192
Verb projection raising, 224–229
Verb-particle construction, 233
Verbal modifier (Hungarian), 229–237
VP ellipsis, 262–264
Weak Crossover, 141–145, 154
West Flemish verb clusters, 224–229
Wiltschko, M., 143–145
Wurmbrand, S. (restructuring verbs), 89–90
X-bar theory, 175–185
Yip, M., J. Maling, and R. Jackendoff (1987), 109–112
Yuman (Lakhota, Alabama) inflection, 222
Current Studies in Linguistics
Samuel Jay Keyser, general editor

1. A Reader on the Sanskrit Grammarians, J. F. Staal, editor
2. Semantic Interpretation in Generative Grammar, Ray Jackendoff
3. The Structure of the Japanese Language, Susumu Kuno
4. Speech Sounds and Features, Gunnar Fant
5. On Raising: One Rule of English Grammar and Its Theoretical Implications, Paul M. Postal
6. French Syntax: The Transformational Cycle, Richard S. Kayne
7. Pāṇini as a Variationist, Paul Kiparsky, S. D. Joshi, editor
8. Semantics and Cognition, Ray Jackendoff
9. Modularity in Syntax: A Study of Japanese and English, Ann Kathleen Farmer
10. Phonology and Syntax: The Relation between Sound and Structure, Elisabeth O. Selkirk
11. The Grammatical Basis of Linguistic Performance: Language Use and Acquisition, Robert C. Berwick and Amy S. Weinberg
12. Introduction to the Theory of Grammar, Henk van Riemsdijk and Edwin Williams
13. Word and Sentence Prosody in Serbocroatian, Ilse Lehiste and Pavle Ivić
14. The Representation of (In)definiteness, Eric J. Reuland and Alice G. B. ter Meulen, editors
15. An Essay on Stress, Morris Halle and Jean-Roger Vergnaud
16. Language and Problems of Knowledge: The Managua Lectures, Noam Chomsky
17. A Course in GB Syntax: Lectures on Binding and Empty Categories, Howard Lasnik and Juan Uriagereka
18. Semantic Structures, Ray Jackendoff
19. Events in the Semantics of English: A Study in Subatomic Semantics, Terence Parsons
20. Principles and Parameters in Comparative Grammar, Robert Freidin, editor
21. Foundations of Generative Syntax, Robert Freidin
22. Move α: Conditions on Its Application and Output, Howard Lasnik and Mamoru Saito
23. Plurals and Events, Barry Schein
24. The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Kenneth Hale and Samuel Jay Keyser, editors
25. Grounded Phonology, Diana Archangeli and Douglas Pulleyblank
26. The Magic of a Common Language: Jakobson, Mathesius, Trubetzkoy, and the Prague Linguistic Circle, Jindřich Toman
27. Zero Syntax: Experiencers and Cascades, David Pesetsky
28. The Minimalist Program, Noam Chomsky
29. Three Investigations of Extraction, Paul M. Postal
30. Acoustic Phonetics, Kenneth N. Stevens
31. Principle B, VP Ellipsis, and Interpretation in Child Grammar, Rosalind Thornton and Kenneth Wexler
32. Working Minimalism, Samuel Epstein and Norbert Hornstein, editors
33. Syntactic Structures Revisited: Contemporary Lectures on Classic Transformational Theory, Howard Lasnik with Marcela Depiante and Arthur Stepanov
34. Verbal Complexes, Hilda Koopman and Anna Szabolcsi
35. Parasitic Gaps, Peter W. Culicover and Paul M. Postal, editors
36. Ken Hale: A Life in Language, Michael Kenstowicz, editor
37. Flexibility Principles in Boolean Semantics: The Interpretation of Coordination, Plurality, and Scope in Natural Language, Yoad Winter
38. Phrase Structure Composition and Syntactic Dependencies, Robert Frank
39. Representation Theory, Edwin Williams