A Derivational Approach to Syntactic Relations
Samuel David Epstein, Erich M. Groat, Ruriko Kawashima, and Hisatsugu Kitahara
New York
Oxford
OXFORD UNIVERSITY PRESS
1998
Copyright © 1998 by Samuel David Epstein, Erich M. Groat, Ruriko Kawashima, and Hisatsugu Kitahara Published by Oxford University Press, Inc. 198 Madison Avenue, New York, New York 10016 Oxford is a registered trademark of Oxford University Press All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press. Library of Congress Cataloging-in-Publication Data A derivational approach to syntactic relations / Samuel David Epstein . . . [et al.]. p. cm. Includes bibliographical references and index. ISBN 0-19-511114-1; ISBN 0-19-511115-X (pbk.) 1. Grammar, Comparative and general—Syntax. 2. Generative grammar. I. Epstein, Samuel David. P291.D476 1998 415—dc21 986637
Acknowledgments
The idea of derivational C-command was first presented by Samuel David Epstein at the Harvard University Linguistics Department Forum in Synchronic Linguistic Theory in 1994. A subsequent version of Epstein's paper was presented in April 1995 to the Linguistics Department at the University of Maryland at College Park; finally, in January of 1997 these ideas were presented at the Linguistics Program at the University of Michigan. Shortly before Epstein's first presentation, Erich M. Groat, Ruriko Kawashima, and Hisatsugu Kitahara had a chance to read the original manuscript. Therewith were launched several independently initiated but related research projects following Epstein's idea. In 1995, some of the results of these projects were presented at various conferences/colloquia (including the eighteenth Generative Linguistics in the Old World (GLOW) Colloquium and the fourteenth West Coast Conference on Formal Linguistics). Some of them turned into a Ph.D. thesis; chapters 3-6 appear in a somewhat different form in Groat (1997). This monograph grew out of these and subsequent works, upon the realization that an unusual and (we hope) inspiring picture of the architecture of natural language came into view once the essential ideas and implications were taken together. The final product is entirely a group effort, though some division of labor may be indicated. To a very good first approximation, the introduction was jointly written by Epstein and Groat, chapter 1 by Epstein, chapter 2 by Kawashima and Kitahara, and chapters 3-6 by Groat. Special gratitude goes to the institutions where this research project was conducted: the department of linguistics and philosophy at the Massachusetts Institute of Technology, the department
of linguistics at Harvard University, the program in linguistics at Princeton University, and the department of linguistics at the University of British Columbia. The material in chapter 2 benefited from presentations at the University of Southern California (the fourteenth West Coast Conference on Formal Linguistics), Princeton University, Harvard University, the University of British Columbia, Meiji Gakuin University, Sophia University, Tohoku Gakuin University, Hokkaido University, the University of Ottawa (the Open Linguistics forum: "Challenges of Minimalism"), and Nanzan University. The research for chapter 2 was supported in part by the Toyota Foundation (research grant given to Hisatsugu Kitahara, 1996-97). Much earlier versions of chapters 3-6 were presented at the eighteenth GLOW Colloquium in Troms0, Norway (1995), as well as at the City University of New York (1996). Most important, we are grateful to the individuals who aided us, corrected us, and made us wiser: Jun Abe, Markus Biehler, Bob Berwick, Jonathan Bobaljik, Maggie Browning, Juan Carlos Castillo, Noam Chomsky, Catherine V. Chvany, Scott Ferguson, Suzanne Flynn, Robert Frank, Robert Freidin, Hajime Fukuchi, Minoru Fukuda, Naoki Fukui, Gunther Grewendorf, Jeffrey Gruber, Sam Gutmann, Mark Hale, Norbert Hornstein, Dany Jaspers, Yasuhiko Kato, Richard Kayne, Sung-Hun Kim, Masatoshi Koizumi, Howard Lasnik, David Lieb, Elaine McNulty, Tomiko Narahara, Jairo Nunes, Masayuki Oishi, John O'Neil, Jean-Yves Pollock, Geoff Poole, Michael Rochemont, Joachim Sabel, Mamoru Saito, Nick Sobin, Adam Szczegielniak, Hoskuldur Thrainsson, Jindrich Toman, Shigeo Tonoike, Esther Torrego, Masanobu Ueda, Juan Uriagereka, Mary Violette, Ken Wexler, Larry Wilson, and Charles Yang. We also thank Elizabeth Pyatt and especially Steve Peter and George Willard for absolutely indispensable editorial assistance. Finally, we thank editor Peter Ohlin and production editor Cynthia L. Garver of Oxford University Press for their expertise and patience.
Contents
Introduction
1. The Derivation of Syntactic Relations
2. A Derivational Application of Interpretive Procedures
3. Derivational Sisterhood
4. Minimize Sisters and Constraints on Structure Building
5. The LCA, Cyclicity, Trace Theory and the Head Parameter
6. On Derivationality
Bibliography
Index
Introduction
The theoretical construct "transformational rule," a binary relation mapping one phrase-marker into another, has long played a central role in generative syntactic theory. Our research explores the following question: What exactly is the empirical content of transformational rules as formulated in contemporary transformational theories of universal syntax? Essentially, we will advance the hypothesis that the structure-building rules Merge and Move (Chomsky 1994) naturally express all syntactically significant relations; that is, if X and Y are concatenated, then X and Y naturally enter into syntactically significant relations. If correct, this theory will allow us to dispense with stipulated and hence unexplained definitions of syntactic relations defined on phrase-structure representations. From Syntactic Structures through the Extended Standard Theory framework, transformations were both language-specific and construction-specific: the linguistic expressions of a given language were construed as a set of structures generable by the phrase-structure and transformational rules particular to that language. Transformational rules thus played a crucial role in the description of particular languages. The advent of the Principles and Parameters (henceforth, P&P) framework called this model into question, first and perhaps foremost on the basis of problems of learnability: not only did the language learner have to ascertain the formal, language-specific properties of the transformational rules that generated the expressions of his/her particular language (i.e., determine both the structural description defining the input structures to which the rule could be applied and the structural change that specified the output of transformational rule-application), but the learner also had to determine both the relative order of application and the obligatoriness versus optionality of each of the rules acquired. Given the poverty of the stimulus, explanatory adequacy was not attained (and was perhaps unattainable within such a framework). The search for a restricted theory of transformations led to a search for universal conditions on transformations (Chomsky 1973) and ultimately to the foundations of the P&P framework (notably Chomsky 1981, Chomsky and Lasnik 1993), which sought to supplant the language-specific and construction-specific aspects of individual rules with a universal set of principles constraining phrase-structure representations, along with a set of parameters that particular languages set in particular (perhaps one of two) ways. Concurrently, the theory of transformational rules underwent drastic simplification. The problem posed by the descriptive character of construction-specificity was avoided by positing no construction-specific rules at all, while the problem of language-specificity was reduced to the question of what constituted the set of parametric values that could vary from language to language (and whose values could be set upon exposure to readily available triggering data). As concerns this shift away from rules and toward a principle-based theory, Chomsky (1982:7-8) writes:
One key question is, How much of the rule system must actually be specified in a particular grammar? Or equivalently, What aspects of the rule system must actually be learned? . . . Much of the research over the past 20 years within the general outlines of the theory of transformational generative grammar has been devoted to narrowing the range of possible alternatives consistent with available data concerning certain well-studied languages. In the course of this work, there has been a gradual shift of focus from the study of rule systems, which have been regarded as increasingly impoverished (as we would hope to be the case), to the study of principles, which appear to occupy a much more central position in determining the character and variety of possible human languages.
Ultimately, the structure-building operations—namely transformational rules and phrase-structure rules—were radically simplified to Affect-α (Lasnik and Saito 1984, 1992) and X' theory
(Jackendoff 1977). The "wild overgeneration" of the impoverished rule system was constrained by principles—that is, wellformedness conditions on representations called filters, applied to the output representations generated by iterative application of unconstrained structure-building rules. Filters apply to syntactic representations, requiring that certain objects and structural relations not appear within representations at a particular level. Recent work following and including Chomsky's work of the 1990s revises this framework in significant ways. The power of a simple and universal transformational component is maintained, but the apparatus of filters applying to levels of syntactic representation is rejected. Linguistic expressions are instead constrained in two fundamental ways. The first strongly resembles the mechanism of filters, insofar as certain interface representations are required to display certain properties: well-formedness conditions on phrase-markers (i.e., representations) are determined extrasyntactically by the requirements of those cognitive systems (including at least the articulatory/perceptual and conceptual/intentional systems) that are presumed to interface with (that is, operate on the output of) the syntactic computational system. Chomsky (1994, 1995) calls such interface conditions the "Bare Output" conditions on PF and LF representations. Like filters on syntactic representations, they constrain the possible output representations of the generative system, but unlike filters in the P&P system, they are neither syntactic principles, nor subject to parametric variation. The specific properties of lexical items determine what structures lexical items may ultimately appear in at the interface levels. The second means of constraining linguistic expressions relies more heavily on the transformational component. Well-formedness conditions on phrase-structure representations (such as X' structure) are rejected in favor of a derivational approach to structurebuilding, whereby admissible structures are determined by whether or not they can be constructed by an apparatus of Binary and Singulary Generalized Transformations (GT)—namely the rules Merge and Move, respectively. Move is understood to apply only if required to create a representation that is interpretable at the interface levels (the principle Greed). The rules Merge and Move,
defining the set of admissible transformations on a given phrasemarker, subsume certain well-formedness conditions on phrasestructure. Within this model, the transformational component is implicated in all aspects of structure-building. There is no bifurcation, as there was from Standard Theory through P&P theories, between the creation of deep structure (i.e. the "base," generated by phrasestructure rules) and the transformational derivation of surface structure from deep structure. In this respect, the approach represents a return to a previous approach, which placed a greater empirical burden on the transformational component. At the same time, the approach preserves the advantages of the P&P model, in that the structure-building apparatus is simple, unified, and universal, while language-specific properties are a reflex of the interaction of the irreducibly language-specific (and parameterized) properties of lexical items (in particular, morpho-lexical features) operating in concert with the structure-building apparatus, "seeking" to satisfy the requirements imposed by the interfaces. While the minimalist approach largely replaces filters on freely generated phrase-structure representations by constraints on the structure-building apparatus itself—that is, by constraints on derivations (such as the definitions of Merge/Move, strict cyclicity, "Shortest Move" requirements, Greed)—the absolutely fundamental notion of "syntactic relation" is still viewed representationally. When we examine the structural configurations that appear to be linguistically significant, both within the derivation (e.g., featurechecking configurations) and to post-syntactic interpretive components (e.g., binding configurations or configurations determining precedence relations), it is clear that the Minimalist Program strongly retains a representational approach to syntactic relations. For example, in order for a category to check some feature that would otherwise be uninterpretable at one or both of the PF and LF interfaces, that category must undergo Move to a position defined representationally to be, at that point in the derivation, within the Checking Domain of a head capable of checking the feature in question. Thus Chomsky, appealing to intermediate representations, defines the purportedly unifying successor of the Government relation, namely, the Domain of a head H°, in turn bifurcated into Checking Domain and Internal Domain.
As another example, for a category to bind another category, the conceptual-intentional system of semantic interpretation must examine the LF phrase-marker and determine whether a C-command relation holds: again, this is a representationally defined relation. Each of these definitions, though well-motivated by empirical research, remains stipulative: why are "C-command" and "Checking Domain" defined as they are? Given a formal object as complex as a tree structure, there are an infinite number of possible relations between its elements that could be defined, yet we find that only a very few of these relations seem syntactically significant. Why? Unexplained definitions of representational relations fail to answer the question. The approach we advance in our theory proposes an answer to these absolutely fundamental questions. Just as well-formedness conditions on phrase-structure such as X' theory can seemingly be eliminated in favor of a simpler and independently motivated apparatus of structure-building, incorporating the rules Merge/Move, we propose that syntactic relations such as C-command and the Checking Relation may be seen to follow from the formal nature of the arguably irreducible, maximally simple, and perhaps unifiable rules Merge and Move themselves. Again, in a nutshell, if X and Y are concatenated (by Merge/Move), then X and Y enter into syntactic relations. If this view is correct, then the stipulation of syntactic relations as arbitrary, representationally defined configurations can be dispensed with under the derivational approach to structure-building. Syntactic relations, in our view, are not properties defined on representations generated by syntactic rule but are properties inherent to and established by the application of the rules Merge and Move themselves. The set of syntactic relations holding of a linguistic expression follows directly, and by hypothesis entirely, from the rules and their order of application that generate the expression—that is, its derivation, and not from stipulative definitions of relations on the phrase-markers created by the derivation. The result is a derivational approach to syntactic relations. What does it mean to claim that a syntactic relation is a property of rule-application, as opposed to being a property defined on a representation? As a brief illustration of the idea, we provide here a sketch of the derivational approach to C-command (which we develop in detail in chapter 1).
Consider the merger of two categories A and B, with A projecting (under bare-theoretic assumptions regarding phrase-structure). Assume that A and B are heads drawn from the lexicon. The application of Merge creates a new syntactic category, C = AP, as in (i):
(i) Merge(A, B) → C = [AP A B]
Consider now the following definition of C-command from Reinhart (1979):
(ii) Representational C-command
A C-commands B iff:
a. The first branching node dominating A dominates B, and
b. A does not dominate B, and
c. A does not equal B.
Of course, since this is a definition, it is non-explanatory—that is, we have no answer to the question of why this particular relation, as opposed to one of an infinite number of other definable relations, should be syntactically significant. Given this representational definition of C-command, we see that in (i) A C-commands B, B C-commands A, and C C-commands no category. But note that the terms that C-command each other are precisely the terms that were merged with each other, namely, A and B. C is not in any C-command relation in the tree (see (i))—and it was not merged in generating (i); rather, C is the output of Merge (A, B). There thus appears to be redundancy between the statements "Merge (A, B), forming C" and "A and B, by definition, C-command each other in the representation of the phrase-marker C." This observation is central to our proposal.
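To make the redundancy at issue concrete, the representational definition in (ii) can be transcribed into a small executable sketch and applied to the structure in (i). The node encoding and helper names below are illustrative assumptions introduced only for this sketch; it simply computes (ii) over the tree built by Merge(A, B).

```python
# A toy encoding of the representation in (i): each node is a dict with a
# label and a list of daughters.  The encoding and helper names are
# illustrative assumptions.

def node(label, *daughters):
    return {"label": label, "daughters": list(daughters)}

def dominates(x, y):
    """x properly dominates y."""
    return any(d is y or dominates(d, y) for d in x["daughters"])

def nodes(root):
    yield root
    for d in root["daughters"]:
        yield from nodes(d)

def c_commands(a, b, root):
    """Reinhart-style representational C-command, clause by clause as in (ii)."""
    if a is b or dominates(a, b):                      # clauses (b) and (c)
        return False
    branching = [z for z in nodes(root)
                 if len(z["daughters"]) >= 2 and dominates(z, a)]
    if not branching:
        return False
    first = min(branching,                             # lowest branching node dominating a
                key=lambda z: sum(dominates(z, w) for w in nodes(root)))
    return dominates(first, b)                         # clause (a)

# (i): Merge(A, B) -> C = [AP A B]
A, B = node("A"), node("B")
C = node("AP", A, B)

for x, y in [(A, B), (B, A), (C, A), (C, B)]:
    print(f'{x["label"]} C-commands {y["label"]}: {c_commands(x, y, C)}')
# Only A and B C-command each other -- exactly the two categories that were
# merged; C, the output of Merge, C-commands nothing.
```

Running the sketch confirms that the only C-command relations in (i) hold between A and B, the two categories paired by Merge, while C enters into none.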
Now consider a second application of Merge, in which we take a category D (again a head drawn from the lexicon) and merge it with C, again projecting A. This rule-application forms E = [AP D [A' A B]].
(iii) Merge(D, C) → E = [AP D [A' A B]]   (note that C = [A' A B])
Under the representational definition, D C-commands C (= A'), A, and B, while C (= A') C-commands only D. E = AP C-commands no category. Note again that the terms that C-command each other are precisely the terms merged with each other, namely, C and D (and A and B, as discussed previously). Note, however, that asymmetric C-command relations also obtain in this representation: D C-commands A and B, while neither A nor B C-commands D. But observe that D was merged with a category containing A and B. It looks as if a category enters into C-command relations with other terms only when it is merged. Thus when D entered the phrase-marker E by being merged with C, A and B were extant as subtrees of C—and, strikingly, there is a C-command relation from D to A and from D to B in the resulting representation. On the other hand, when A and B were merged, D was not yet in the tree—and, strikingly, there is no C-command relation either from A to D or from B to D in the resulting representation. There is thus complete redundancy between the C-command relations implicit in the derivation of the tree in (iii)—namely, "Merge (A, B), then Merge (C, D)"—and the C-command relations
defined representationally on the tree. We can eliminate this redundancy by making explicit the C-command relations that obtain given an application of Merge (dispensing with the representational definition). The following definition formalizes this idea:
(iv) Derivational C-Command
X C-commands all and only the terms of the category Y with which X was paired/concatenated by Merge or Move in the course of the derivation.
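The derivational definition in (iv) can be given a similarly minimal sketch. Here the derivation itself—a record of which two categories each application of Merge concatenated—is the only object consulted; the representation E is never inspected. The tuple encoding and function names are illustrative assumptions.

```python
# Toy encoding: a syntactic object is a head (a string) or a labeled tuple
# (label, left, right) built by merge().  Only the record of Merge
# applications is consulted when computing C-command.

derivation = []                      # pairs concatenated in the course of the derivation

def merge(label, x, y):
    derivation.append((x, y))
    return (label, x, y)

def terms(k):
    out = [k]
    if isinstance(k, tuple):
        out += terms(k[1]) + terms(k[2])
    return out

def derivationally_c_commands(x, y):
    """(iv): X C-commands all and only the terms of the category Y with which
    X was paired/concatenated in the course of the derivation."""
    return any((x == a and y in terms(b)) or (x == b and y in terms(a))
               for a, b in derivation)

# The derivation of (iii): Merge(A, B) -> C, then Merge(D, C) -> E.
A, B, D = "A", "B", "D"
C = merge("A'", A, B)
E = merge("AP", D, C)

lab = lambda k: k if isinstance(k, str) else k[0]
for x, y in [(A, B), (B, A), (D, A), (D, B), (D, C), (C, D), (A, D), (B, D)]:
    print(f"{lab(x)} C-commands {lab(y)}: {derivationally_c_commands(x, y)}")
# A and B C-command each other; D and C C-command each other; D also
# C-commands A and B (terms of C when D was merged), but neither A nor B
# C-commands D -- matching the relations read off (iii) representationally.
```

Note that the asymmetry falls out for free: D was paired with a category whose terms include A and B, whereas A and B were paired only with each other, before D entered the derivation.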
Using this definition, we see that to determine the C-command relations that hold within a given phrase-marker, we examine not the structure of the phrase-marker (a representation) but, rather, its derivation, expressed as a partially ordered sequence of Merge (and ultimately also Move) operations. The preceding illustration of how a derivational definition of C-command might be developed is intended simply to show what we mean by attributing a syntactic relation to a rule-application as opposed to a representation. The immediate empirical consequences of this change are explored. We identify the consequences of this definition for X' invisibility and demonstrate how overt (but not covert) strict cyclicity follows from the derivational definition when coupled with Kayne's (1994) Linear Correspondence Axiom (henceforth LCA), effectively deducing Chomsky's (1993) stipulation that the Extension Condition holds only for overt substitution operations. But the significance of our derivational approach, we believe, goes well beyond these immediate empirical consequences. First, it may be possible to deduce the derivational definition of C-command from independent and conceptually necessary principles. If our deduction is correct, then we may eliminate the definition of Ccommand entirely: C-command becomes the only syntactic relation that is natural without further stipulation. This constitutes what might well be a significant advance in the understanding of syntactic computation. Second, regardless of whether the derivational definition is deducible, its incorporation into syntactic theory allows us to pro-
ceed toward a theory in which syntactic representations play no role outside the syntactic derivation itself: they do not serve as a set of instructions to other cognitive systems (such as PF computation or semantic interpretation). This subtle but radical shift in the theory has certain highly desirable conceptual and empirical consequences. Adopting the derivational definition of C-command not only requires the standard examination of immediate empirical consequences that follow from the adoption of any new definition, but also has consequences that suggest a new way of conceiving the syntactic computational system. As an example of the consequences of our approach, let us consider the following paradox seemingly inherent in Chomsky's model of a minimalist grammar. Recall that the minimalist approach eliminates the mechanism of syntactic filters applying to syntactic levels, replacing them by appeal to what are currently largely undetermined interpretive requirements imposed by postsyntactic computation, the "Bare Output Conditions" (Chomsky 1994, 1995), which determine allowable interface representations on the basis of the interpretive requirements of semantic and phonological computation external to the syntactic component. In the formalization of this approach, Chomsky distinguishes between objects (syntactic categories and features) that are legitimate and those that are illegitimate at the interfaces (Chomsky 1991). As an example, unchecked formal features are hypothesized to be uninterpretable to semantic—non-syntactic—computation. Thus such features must be checked before the phrase-marker can be interpreted; unchecked features are illegitimate. At the same time, the syntactic operation Move is hypothesized to be constrained by Greed, which states that Move applies only in order to check features that would otherwise be uninterpretable—that is, only in order to create legitimate objects. In fact, any syntactic operation other than binary Merge (such as Move) is generally constrained by the requirement that legitimate, interpretable objects result from the application of that operation (Last Resort; see also Epstein 1992, Lasnik 1995). Notice that the notion of "legitimacy" is defined only with respect to non-syntactic interpretation. But if post-syntactic interpretation applies only to a single output phrase-marker, generated by syntactic structure-building rules (such as Move), how
can there be a well-defined notion of legitimacy characterizing "licit rule-application" at some intermediate point in the derivation, to which post-syntactic computation has no access? By requiring that, internal to the syntax, each rule-application be constrained by the ultimate requirements of the interfaces (the Bare Output Conditions), we arrive at a paradox: the requirements of the interfaces, by hypothesis, apply exclusively at the interfaces, and not internal to a derivation—yet Last Resort, requiring that rules apply only to yield "legitimacy," entails that legitimacy must be defined with respect to rule-application within a derivation. In our model, no such paradox arises. Given that syntactic relations in our theory are properties of rule-application, and hence of derivations, the post-syntactic computational systems must examine not "output representations" to determine such relations but, rather, the derivation, which consists of the partially ordered application of syntactic rules. Hence, at each point in the derivation at which a rule (such as Move) is applied, interface conditions hold, and the notion of "legitimacy" remains well-defined internal to the derivation. In the case of feature checking, for example, Move will be allowed only if it has the property that a Checking Relation is established that renders legitimate some otherwise uninterpretable feature. Effectively, this result is the same as the current minimalist Last Resort requirement "Greed"—but differs crucially in that the "legitimacy" of a category remains a purely extrasyntactic notion: the computational systems beyond the interfaces "see" the operation Move, and thus "see" the Checking Relation that results in the legitimacy of the feature being checked. Semantic and phonological interpretation is thus "invasive" to the syntax, being isomorphic to the syntactic derivation and interpreting it as it proceeds, rather than interpreting an output phrasemarker representation generated by the derivation. There is no phrase-marker that serves as the sole object of semantic interpretation (the minimalist carry-over of an LF "level" of representation); nor is there a phrase-marker that is "Spelled-Out" to the phonological component (the minimalist analog of an S-Structure "level" of representation). The "Y-model" of syntax is thus eliminated. For every syntactic operation there is a corresponding interpretive operation in the semantic and/or PF components.
This theory of syntax might comport well with a compositional approach to semantic interpretation. Instead of cyclic, syntactic construction of an entire phrase-marker followed by cyclic, semantic composition which retraces the steps of the syntax, the syntax instead serves as a set of instructions to the semantic component to build cyclically composed structures, in the manner of Montagovian categorial grammar. This simplification resembles the resolution (attained by the minimalist return to Generalized Transformations) of a long-standing problem with respect to purely syntactic strict cyclicity in preminimalist theories. Why should phrase-structure rules create a complete (deep) structure representation, only to have transformational rules apply cyclically to subtrees of that representation? Cyclic transformational rule-application applies exactly as if higher structure didn't yet exist. Why? These questions disappear when deep structure is eliminated and structure-building subsumes both base generation and transformations on the structure built so far—that is, derivationally. Similarly, why should the syntax build a structure cyclically, only to have compositional semantic computation apply cyclically to the already constructed subtrees of the completed interface structure? In our model, this question also disappears, since semantic interpretation is "invasive," proceeding concurrently with structure-building, i.e. derivationally.
In sum, our theory has the following innovative properties:
A. The syntactic computational system consists only of the (perhaps unifiable; see Kitahara 1994, 1995) syntactic rules Merge and Move. Syntactic relations are not unexplained relations defined on representations generated by these rules. They are instead inherent to the rules themselves: concatenation (Merge/Move) creates relations. The arguably unifying, but entirely stipulative, non-explanatory and complex representational definitions of Government and/or Minimal Domain are eliminated.
B. There are no syntactic representations that serve as instructions to post-syntactic computation or interpretation. Representations are objects created by post-syntactic computation on the basis of syntactic derivational steps. Such steps are a partly ordered list (a "quasi order") of successive application of the rules Merge and Move.
C. There are consequently no representational definitions of syntactic relations, either at the interfaces (e.g., "C-command" used in determining binding or precedence relations (Kayne 1994)), or internal to derivations (e.g., "Checking Domain").
D. There are no independent interface levels of representation. Interpretation, both semantic and phonological/morphological, is invasive to the syntax: for every syntactic rule-application, there is at least one corresponding interpretive operation in the PF branch and/or the semantic component.
E. Consistent with minimalist assumptions, there can be no stipulated "filters" on non-existent levels of representation. There are only lexical items and operations on these items that ensure the interpretability of their formal, semantic, and morpho-phonological features.
The theory we suggest thus results in a significant departure from proposals made in Chomsky (1993, 1994), largely because we eliminate many of the principles assumed in that work, deducing similar but somewhat different entailments. However, our theory is entirely consistent with and owes much to the minimalist approach to linguistic theory from which it stems, insofar as we attempt to understand how much empirical coverage the seemingly simplest assumptions concerning universal syntax can achieve. We argue that these simplest assumptions do considerably more empirical and explanatory work than previously supposed. Examining the properties of independently motivated, universal, maximally simple, and arguably unifiable structure-building rules may allow us to dispense with both stipulative representational definitions of syntactic relations, and the notion of "levels" of representation.1
Notes
1. The theory proposed here also has interesting consequences with respect to theories of syntactic parsing. See, e.g., Berwick and Epstein (1995), Yang (1996a, 1996b).
1 The Derivation of Syntactic Relations
Alongside explicit specifications of "syntactic feature" and "permissible syntactic feature bundle" (i.e., a possible syntactic category), perhaps the most fundamental construct postulated within syntactic theory is that of a "syntactic relation." For example, the following are each considered to be core syntactic relations.
(1) a. (Subject-Verb 3sg.) Agreement: The dog kicks walls.
b. (Object-Verb) Theta Relation: The dog kicks walls.
c. (Accusative) Case Relation: The dog kicks them.
d. (Reflexive) Binding Relation: The dog kicks herself.
e. (Passive) Movement Relation: The wall was kicked t.
f. The "is a" Relation: The dog [VP kicks [DP walls]] (kicks and walls together constitute, i.e., are, a VP).
Such relations are apparently very heavily constrained; that is, we do not find empirical evidence for all of the logically possible syntactic relations. Rather, at least from the perspective of unified theories (to be discussed momentarily), we find only one, or perhaps a few, to be distinguished on a principled basis. Given that syntactic relations are very heavily constrained, the questions we confront include: "What are they?" "How are they to be formally expressed?" and, more deeply, "Why do we find these and not any of the infinite number of logically possible others?" Within the framework of Chomsky (1981, 1982), there are, by hypothesis, at least two fundamental syntactic relations. The first, the
"Government" relation (see also Chomsky 1986) isa unified construct, a binary relation under which all the seemingly disparate phenomena illustrated in the English examples ((la-e), but interestingly not (If)) are to be captured. By contrast, the "is a" relation (If) was not a Government relation, but rather a relation created by the base component, upon which Government relations were defined. For example, the trinary relation in (If) "V and DP are a VP" is not a Government relation. This type of theory of syntactic relations arguably confronts a number of conceptual (and empirical) problems: A. Unification Unification is, in a sense, precluded in that the "is a" relation is divorced from the Government relation. Government relations are defined on base-generated representations already exhibiting "is a" relations. Why isn't there just one relation? If there is not, why are (la-e) (these 5 cases) unified under Government, with (If) (the "is a" relation) the one left out? B. Explanation Explanation at a certain level is lacking to the extent that the (or one of the) fundamental relations, Government (Chomsky 1986a), or more recently "Minimal Domain" (Chomsky 1993) a binary relation defined on representations, is merely a definition. Hence true explanation is lacking at this level—that is, the following is unanswered: Why is Government as defined in (3) or Minimal Domain the fundamental syntactic relation, rather than any of the infinite number of other logically possible, syntactically definable relations? C. Primitive Constructs Government and Minimal Domain are not in fact primitives in that they each incorporate a more fundamental binary relational construct, command. Thus Government and Minimal Domain are not just (unexplained) definitions; they are in fact, contrary to standard assumption, not the fundamental unexplained definition. Rather, the relation command is. Of course, if command is to express a fundamental syntactic relation, and it remains an
18
A DERIVATIONAL APPROACH TO SYNTACTIC RELATIONS
(unexplained) definition, then, in the exact same sense as in (B), explanation is lacking: Why is this relation, so defined, syntactically significant? D. Complexity Government and Minimal Domain definitions are complex. Of course, this claim has no substance in the absence of an explicit principled complexity metric. Hence, we will leave it to the intuition of the reader that the alternative theory of syntactic relations proposed in this book achieves significant simplification and (we hope) exhibits no associated loss (and perhaps even a gain) in empirical adequacy. In this chapter we will address each of these four closely related problems confronting the fundamental construct "syntactic relation" as it is expressed in contemporary syntactic theories. The analysis will be couched within the Minimalist Program (outlined in Chomsky 1993 and later developed in Chomsky 1994, 1995). Importantly, the minimalist approach exhibits the following innovations: 1. D-structure is eliminated and, along with it, the bifurcation of the D-structure-generating base component and the transformational component. 2. Generalized Transformations (i.e., Merge), arguably unifiable with the Singulary Transformations (== Move-a), as proposed in Kitahara (1994) and in chapters 3-5, are reinstated. The central hypothesis we will propose here can be expressed in an informal (inexact) and preliminary fashion as follows: (2)
(2) Preliminary Hypothesis
a. The fundamental concept "syntactic relation"—for example, "Government" or "Minimal Domain"—is not an unexplained definition defined on representations (i.e., "already built-up" phrase-structure representations). Rather, syntactic relations are properties of independently motivated, simple, and minimal transformations. That is, syntactic relations are established between a syntactic category X and a syntactic category Y when (and only when) X and Y are transformationally concatenated (thereby entering into sister-relations with each other) by either Generalized Transformation or Move-α during the tree-building, iterative, universal rule-application which constitutes the derivation.
b. The fundamental structure-building operation is not Move-α (Chomsky 1981, 1982) or Affect-α (Lasnik and Saito 1992) or Attract-α (Frampton personal communication) but, rather, "Concatenate X and Y, thereby forming Z."
The analysis we will propose is natural in that concatenation, and only concatenation, establishes syntactic relations between categories. Given this hypothesis, and given that any derivational syntactic system requires some concept of concatenation, it follows that the fundamental construct "syntactic relation" should fall out—that is, it will be deducible from the independently and strongly motivated notion of concatenation as expressed by the rules Merge and Move in the Minimalist Program. To the extent that we are correct that "syntactic relation" is, contra most contemporary theories of syntax, an explicable derivational construct, not a representational definition, the four central and forbidding obstacles A-D that confront current syntactic explanation can be overcome. Before proceeding, we would like to briefly place this hypothesis—that fundamental, representational, unexplained definitions can be replaced by derivational explanations—in a broader historical context. As is well known, the "rule-less," representation-based P&P theory evolved from earlier rule-based systems. The construction-specificity and language-specificity of the phrase-structure and transformational rules postulated represented a serious obstacle to explanatory adequacy: "How does the learner select this grammar with these rules, based on exposure to degenerate data?" An entirely natural development was the gradual abandonment of rule-
based grammars and the concomitant postulation of universal constraints on representations or "principles" (expressing the properties common to rules) which were consequently neither construction nor language particular entities. The residue—namely, the language-specific properties of rules, to be fixed by experience—were ascribed the status of parameters with the hope that constructionspecificity would be altogether eliminated from core grammar. While the abandonment of rule systems and the adoption of principles (i.e., filters or well-formedness conditions on rule-generated representations) was an entirely natural development, there is an alternative, which we believe is reflected (perhaps only implicitly) in Chomsky (1991, 1993, 1994, 1995). The alternative is this: given that it was the language-specificity and constructionspecificity of rules, and not the fact that they were rules per se that apparently threatened explanatory adequacy, an alternative to the postulation of principle-based theories of syntax is to retain a rule-based framework but eliminate from the rules their language-particular and construction-particular formal properties. That is, instead of universal principles (such as constraints on representations as found in Binding Theory, Case Theory, X Theory, Theta Theory, and so on), an alternative is to postulate universal rules, thereby maintaining a "strongly derivational" theory of syntax, like Standard Theory in that it incorporates iterative application of rules but unlike Standard Theory in that the rules are "universalized" as are Generalized Transformation (Merge) and Move (Chomsky 1993, 1994). Rules are then purged of all language-particular or construction-specific properties: apparent cross-linguistic differences are attributed, by hypothesis, to morphological variation that affects the possible application of those rules. In this chapter, we argue that this strongly derivational universal-rule approach, in which iterative rule-application characterizes syntactic derivations while constraints on output levels of representation (hence levels themselves) are altogether eliminated (see Chomsky 1994, 1995), exhibits explanatory advantages over the existing, representational, "rule-free," principle-based (hence representation-based) theories — at least in the domain of accounting for the absolutely central construct "syntactic relation."
1.1. Syntactic Relations in Principle-Based Theory
In the pre-minimalist, principle-based framework of Chomsky (1986), representations (phrase-structure trees) are freely built by an unconstrained implicit base. The output is constrained by the X' schema, a filter on output representations. The unifying construct Government is a binary, unidirectional, and asymmetric syntactic relation holding between two syntactic categories in a derived representation. The unifying syntactic relation, Government, is defined as follows:
(3) a. Government
X governs Y iff
i. X M-commands Y, and
ii. there is no Z, Z a barrier for Y, such that Z excludes X.
b. M-command
X M-commands Y iff the minimal maximal projection dominating X dominates Y (see Aoun and Sportiche 1983).
c. Excludes
X excludes Y iff no segment of X dominates Y.
d. Dominates
X dominates Y only if every segment of X dominates Y.
e. Barrier
Z is a barrier for Y iff
i. Z immediately dominates W, W a blocking category for Y, or
ii. Z is a blocking category for Y and Z is not IP.
f. Immediately Dominates
A maximal projection X immediately dominates a maximal projection Y iff there is no maximal projection Z such that X dominates Z and Z dominates Y.
g. Blocking Category
Z is a blocking category for Y iff
i. Z is not L-marked, and
ii. Z dominates Y.
h. L-marks
X L-marks Y iff X is a lexical category that theta-governs Y.
i. Theta-Government
X theta-governs Y iff
i. X is a zero-level category, and
ii. X theta-marks Y, and
iii. X and Y are sisters.
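For concreteness, the chain of definitions in (3) can be transcribed into a toy executable form. The sketch below rests on strong simplifying assumptions—every category is a single segment (so exclusion reduces to non-domination), there is no adjunction, and properties such as "lexical," "zero-level," "maximal projection," and theta-marking are simply stipulated on the toy nodes; all class and function names are illustrative. Its only purpose is to display how many interlocking auxiliary notions a single application of "govern" invokes.

```python
# Toy transcription of definition (3), under the simplifying assumptions
# stated above (single-segment categories, no adjunction).

class Cat:
    def __init__(self, label, daughters=(), lexical=False,
                 zero_level=False, maximal=False):
        self.label, self.daughters = label, list(daughters)
        self.lexical, self.zero_level, self.maximal = lexical, zero_level, maximal
        self.theta_marked = []                 # categories this head theta-marks

def all_cats(root):
    yield root
    for d in root.daughters:
        yield from all_cats(d)

def dominates(x, y):                           # (3d), single-segment version
    return any(d is y or dominates(d, y) for d in x.daughters)

def excludes(x, y):                            # (3c), single-segment version
    return not dominates(x, y)

def sisters(x, y, root):
    return any(x in z.daughters and y in z.daughters for z in all_cats(root)) and x is not y

def m_commands(x, y, root):                    # (3b)
    maximal = [z for z in all_cats(root) if z.maximal and dominates(z, x)]
    if not maximal:
        return False
    lowest = min(maximal, key=lambda z: sum(dominates(z, w) for w in all_cats(root)))
    return dominates(lowest, y)

def theta_governs(x, y, root):                 # (3i)
    return x.zero_level and y in x.theta_marked and sisters(x, y, root)

def l_marks(x, y, root):                       # (3h)
    return x.lexical and theta_governs(x, y, root)

def blocking_category(z, y, root):             # (3g)
    l_marked = any(l_marks(w, z, root) for w in all_cats(root))
    return not l_marked and dominates(z, y)

def immediately_dominates(x, y, root):         # (3f), for maximal projections
    return (x.maximal and y.maximal and dominates(x, y)
            and not any(w.maximal and dominates(x, w) and dominates(w, y)
                        for w in all_cats(root)))

def barrier(z, y, root):                       # (3e)
    inherited = any(immediately_dominates(z, w, root) and blocking_category(w, y, root)
                    for w in all_cats(root))
    intrinsic = blocking_category(z, y, root) and z.label != "IP"
    return inherited or intrinsic

def governs(x, y, root):                       # (3a)
    return m_commands(x, y, root) and not any(
        barrier(z, y, root) and excludes(z, x) for z in all_cats(root))

# Toy clause [IP DP_subj [I' I [VP V DP_obj]]]
dp_obj  = Cat("DP_obj", maximal=True)
v       = Cat("V", lexical=True, zero_level=True); v.theta_marked.append(dp_obj)
vp      = Cat("VP", [v, dp_obj], maximal=True)
infl    = Cat("I", zero_level=True)
i_bar   = Cat("I'", [infl, vp])
dp_subj = Cat("DP_subj", maximal=True)
ip      = Cat("IP", [dp_subj, i_bar], maximal=True)

print(governs(v, dp_obj, ip))     # True: no barrier for DP_obj excludes V
print(governs(infl, dp_obj, ip))  # False: VP is a barrier for DP_obj and excludes I
```

On the toy clause the sketch yields the familiar result that V governs its complement while I does not (VP is a barrier for the object that excludes I but not V).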
While such an approach constitutes an impressive, highly explicit, and unified analysis of a number of seemingly very disparate syntactic phenomena, it arguably suffers from the four problems noted in A-D above. First, it is not wholly unified since the "is a" relation is not a Government relation. Second, "Government" is a definition, hence it is entirely unexplained why syntactic phenomena, by hypothesis, conform to this particular relation and not to any of the other infinite alternative and readily definable relations. Third, "Government" is not really a primitive relation since it incorporates the more primitive relation "M-command" (3a.i, 3b). Fourth, the definition of "Government" is, arguably, "complex" (though see the previous section of this chapter for discussion of the unclarity of such a claim). Let us begin by addressing the question of what is, by hypothesis, primitive. Following Chomsky (1993:fn 9), we will assume that M-command in fact plays no role. We will, however, assume that C-command is indeed a primitive. In the next section, we will attempt to show that contrary to all syntactic analyses since Reinhart (1979) and including, most recently, Kayne (1994), Ccommand need not be expressed as an unexplained, representational definition but can instead be expressed as a natural explicable derivational construct, assuming Chomsky's (1993, 1994, 1995) elimination of a distinct base component (and along with it, the elimination of a base-generated deep structure level of repre-
sentation), and the postulation of a syntactic component in which derivations are characterized by iterative, bottom-up application of universal, simple, and perhaps unifiable rules, Merge (Generalized Transformation) and Move (Singulary Transformation).
1.2. C-Command
In this section we review the representational definition of C-command and provide arguments that a derivational definition eliminates a massive redundancy in the system, bringing us closer to an explanation of C-command.
1.2.1. Representational C-Command
Consider the following representational definition of C-command:
(4) A Representational Definition of C-command
A C-commands B iff:
i. The first branching node dominating A dominates B, and
ii. A does not dominate B, and
iii. A does not equal B. (Reinhart 1979)
The first thing to consider is that (4) constitutes a definition. If this is where we begin our analysis, then we begin at a point where explanation is lacking. In other words, we have no answer to the question: "Why is this particular binary relation syntactically significant?" The definition itself provides no particular insight into the significance of this relation above any other imaginable one. Our task is to formulate a means of construing C-command in such a way that the definition (4) is explained, either via a more natural definition or by some demonstrations that no definition is needed at all. As an illustration, consider (5):
(5) A Schematic Illustration of C-command
In (5), Spec IP = Da C-commands I', INFLwill, VP, Vlike, D, Dthe, Nmove, and nothing else. The question is why. Note that it is exactly as if the other categories in (5) are, with respect to Da, (inexplicably) invisible; hence Da enters into no relations with these other categories. That is, Da C-commands none of these others (although Da is C-commanded by some). We should also note some other important properties of C-command. First, by being a definition, it is nonexplanatory. Second, it is pervasive and fundamental, apparently playing a unifying role throughout the different subcomponents of the syntax. Third, it is persistent: despite substantive changes in the theory of syntax, Reinhart's definition, proposed almost two decades ago, remains linguistically significant. Fourth, as noted, it is representational—that is, it is a relation defined on representation.
The unanswered questions confronting C-command are thus, at least, the following:
(6) a. Why does it exist at all? Why doesn't A enter relations with all constituents in the tree?
b. Why is the first branching node relevant? Why not "the first or second or third (or nth) node dominating A must dominate B"?
c. Why is branching relevant?
d. Why doesn't A C-command the first branching node dominating A, but instead C-commands only categories dominated by the first branching node?
e. Why must A not dominate B?
f. Why must A not equal B?
Thus we see that one of the most fundamental unifying relations is expressed as a purely stipulated representational definition. The hypothesis we will advance is that the properties of C-command just noted are not accidental but are intimately related. First, we believe it is fundamental, pervasive, and persistent because it is a natural syntactic relation. Second, we propose that it is stipulative and nonexplanatory precisely because it has been formulated as a representational relation. Third, we will propose that C-command is in fact derivational—a relation between two categories X and Y established in the course of a derivation (iterative universal rule-application) when and only when X and Y are paired (concatenated) by transformational rule, either Merge or Move. Ultimately, C-command is a property of derivations. Construed derivationally, the unanswered questions confronting the representational definition will receive natural answers.
1.2.2. The Derivation of C-Command
To begin with, we will assume the following:
(7) a. Merge and Move (Chomsky 1993, 1994) are at least partly unifiable (as proposed in Kitahara 1994, 1995, 1997; Groat 1995a, 1997) in that each pairs (= Concatenates) exactly two categories, "A" and "B", rendering them sisters immediately dominated by the same (projected) mother "C" (where C = the head of A or of B (Chomsky 1994, 1995)).
b. Given (7a), there is a fundamental operation, common to or shared by both Merge and Move alike, namely, Concatenate A and B forming C (C = the head of A or of B).1
Crucially, what the universalized transformational rules Merge and Move each do, by hypothesis, is establish a syntactic relation between two concatenated syntactic categories A and B by virtue of placing the two in the "is a" relation with C, the projected category. We will also assume for the time being that Merge applies cyclically (and that Move does so as well). Consider, for example, the following derivation: (8)
Merging Vlikes and Dit yields, informally:
The "lower"Vlikes(= A) and Dit (= B) are, by virtue of undergoing Merge, in a relation: they are "the sister constituents of a V-phrase/ projection C, labeled Vlikes." Thus what Merge does is create sisters—that is, it concatenates exactly two categories A and B, and projects their mother C (C the head of A or of B). Crucially, A and B cannot be sisters without having a common mother. Conversely, if nonbranching projection is disallowed, and only binary branching is permitted, then there cannot be a mother C without exactly two daughters (the sisters A and B). In a nutshell, both the sisterhood relation and the motherhood relation (the latter, the "is a" relation) are simultaneously created in one fell swoop, by a single Merge application. Thought of in terms of Standard Theory transformations, Vlikes and D,Y constitute the Structural Description of Merge. The Structural Change (perhaps deducible, given the Structural Description; see Chomsky 1995) specifies the categorical status of the mother or output tree/set. Thus invariably the two entities in the Structural Description are rendered sisters, that is, they are placed in the "is a" relation to the projected (perhaps predictable) mother C, all of this internal to a single Merge application. Consequently, there is conceivably no need for a representational definition of the "is a" relation, since these two relations are clearly expressed (and unified) within the independently motivated, universal structure-building rules themselves. Representational definitions would therefore be entirely redundant, and, as definitions, nonexplanatory.2 The tree in (8) is formally represented as {Vlikes {V likes, D it }}. This object consists of three "terms": (9)
(9) a. the entire tree/set (= C)
b. Vlikes (= A)
c. Dit (= B)
That is, following Chomsky (1994, 1995), we assume:
(10) a. Definition of term ("constituent")
For any structure K,
i. K is a term of K (the entire set or tree is a term), and
ii. if L is a term of K, then the members of the members of L are terms of K. (Chomsky 1994, p. 12)
b. The terms in (8):
i. K = {Vlikes, {Vlikes, Dit}} = one term
ii. K has two members: member 1 = Vlikes = "the label"; member 2 = a 2-membered set = {Vlikes, Dit}
iii. M1 and M2 = members of a member—that is, each is a member of member 2 of K. Therefore, each is a term.
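The definition in (10) can be stated as a short executable sketch over the object built in (8). The pair-based encoding of {Vlikes, {Vlikes, Dit}} and the function names are assumptions made only for illustration.

```python
# Toy encoding of the object built in (8): a head is a string; a projection
# is a pair (label, (member1, member2)), standing in for {label, {member1, member2}}.

def merge(alpha, beta, label):
    """Concatenate alpha and beta, projecting label (the head of one of them)."""
    return (label, (alpha, beta))

def terms(k):
    """(10a): K is a term of K; if L is a term of K, then the members of the
    members of L are terms of K."""
    result = [k]                     # (10a.i)
    if isinstance(k, tuple):
        label, two_membered_set = k  # K's members: the label and a 2-membered set
        for member in two_membered_set:
            result += terms(member)  # (10a.ii): members of that member are terms
    return result

K = merge("Vlikes", "Dit", label="Vlikes")   # {Vlikes, {Vlikes, Dit}}
print(terms(K))
# -> [('Vlikes', ('Vlikes', 'Dit')), 'Vlikes', 'Dit']: the whole object,
#    Vlikes, and Dit -- exactly the three terms listed in (9) and (10b).
```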
Thus "terms correspond to nodes of the informal representations, where each node is understood to stand for the subtree of which it is the root" (Chomsky 1994, 1995).3 Continuing with the derivation, suppose that concurrent with the construction of (8), we construct the following separate phrasemarker (recall that separate phrase-markers may be constructed in parallel, as long as a single phrase-marker results by the end of the derivation; see chapter 5; see also Collins 1997): (11)
Merge Dthe and Ndog yielding informally:
The tree is formally represented as {Dthe, {Dthe, Ndog}}, similarly consisting of three terms: the entire two-membered set and each of the two categories that are members of a member of the two-membered set (namely, Dthe and Ndog). Now, having constructed the two three-membered trees in (8) and (11), suppose we merge
these two, yielding (12) ("a," "b," and "c" are purely heuristic: Da = Dthe, and Vb and Vc are both Vlikes):
Now notice that there exists a massive redundancy: the representational definition of C-command in (4) stipulates C-command relations between sisters in the derived representation (12). But sisters are precisely the objects A and B which invariably undergo Merge in building the representation, thus:
(13) In (12):
i. a. Dthe representationally C-commands Ndog: they were merged.
b. Ndog representationally C-commands Dthe: they were merged.
ii. a. Vlikes representationally C-commands Dit: they were merged.
b. Dit representationally C-commands Vlikes: they were merged.
iii. a. Da representationally C-commands Vb: they were merged.
b. Vb representationally C-commands Da: they were merged.
iv. Vc representationally C-commands nothing: it has not been merged with any category.
v. In (12), the ten binary Dominance relations: no two categories that are in this relation were merged.
vi. No category representationally C-commands itself: no category is (by pure stipulation in (4c)) merged with itself.
Thus we see that Merge, an entirely natural and independently motivated structure-building operation (i.e., transformational rule), seems to capture certain representational C-command relations: that is, if X and Y are concatenated, they enter into what we have called "C-command relations." Consequently, it would seem that we could eliminate the stipulated, unexplained representational definition of "C-command" (4) with respect to these cases, since the relation is expressed by independently motivated transformational rule. There is, however, a problem with this suggestion: when Merge pairs two categories, this establishes only symmetrical (reciprocal) C-command relations. Consider example (14): (14)
The arrows in (14) each indicate desired C-command relations. But Merge does not totally subsume the representational definition of C-command, precisely because there exist C-command relations between two categories that were not merged. Thus (14'a) is true, but (14'b) is false: (14') a. If A and B were merged, then A C-commands B and B Ccommands A. b. If A C-commands B, then A and B were merged.
The Derivation of Syntactic Relations
31
To see the falsity of (14'b), consider the C-command relations that we desire to obtain in a structure like (15): (15) Da and Vb are merged:
The specifier of Vlikes' Da C-commands the head Vlikes and the complement D!Y, but Da was merged with neither the head Vlikes nor the complement Dit As a solution to this problem confronting our attempt to entirely deduce representational C-command from Merge, we make the following observation: notice that although Da was not merged with Viikes nor with Dit, Da was merged with Vb. But now recall that Vb = {V likes ={ v likes, D i t } } consists of three terms: (1) {Vb [Vlikes, D i t }} (the whole Vb subtree in (15)), (2) Viikes, and (3) Dit. Given that a syntactic category is a set of terms (in Dominance/precedence relations), e.g. Vb consists of three terms, we can propose the following, natural derivational definition of C-command: (16) Derivational C-Command (Preliminary Version) X C-commands all and only the terms of the category Y with which X was merged in the course of the derivation. Thus Da C-commands Vb and all terms of Vb. Now, recall that Move, the other structure-building operation, also pairs/concatenates exactly two categories, projecting the head of one, and in this respect is identical to Merge (Kitahara 1993, 1994, 1995).4 There-
Therefore, since "is a" relations are created by Move in the same manner as they are created by Merge, we can now propose:
(16') Derivational C-Command
X C-commands all and only the terms of the category Y with which X was paired/concatenated by Merge or by Move in the course of the derivation.
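To make the set-theoretic bookkeeping behind (16') concrete, the following is a minimal computational sketch of how C-command relations could be recorded at the very point of concatenation. It is offered purely as an expository aid: the class, function, and label names (Node, concatenate, Da, Vb, Vc) are our own assumptions, not notation proposed in the text.

class Node:
    """A syntactic object: a lexical item, or the output of Merge/Move."""
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left    # first concatenated category (None for a lexical item)
        self.right = right  # second concatenated category

    def terms(self):
        """All terms of this category: itself, plus the terms of whatever was concatenated to form it."""
        result = [self]
        if self.left is not None:
            result += self.left.terms() + self.right.terms()
        return result

c_command = set()  # pairs (X, Y) meaning "X C-commands Y", recorded as the derivation proceeds

def concatenate(a, b, label):
    """Pair/concatenate two categories (Merge or Move), projecting `label`;
    each input comes to C-command every term of the other, as in (16')."""
    for t in b.terms():
        c_command.add((a.label, t.label))
    for t in a.terms():
        c_command.add((b.label, t.label))
    return Node(label, a, b)

the, dog, likes, it = Node("the"), Node("dog"), Node("likes"), Node("it")
Da = concatenate(the, dog, "Da")    # builds {the, dog}
Vb = concatenate(likes, it, "Vb")   # builds {likes, it}
Vc = concatenate(Da, Vb, "Vc")      # merges the specifier with Vb, as in (12)

print(("Da", "it") in c_command)    # True: Da C-commands every term of Vb
print(("the", "it") in c_command)   # False: the and it were never concatenated together

Run on the derivation of (12), the sketch records exactly the pairs listed in (13): each input to Merge comes to C-command its sister and every term of its sister, and nothing else.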
Given (16'), consider the case of Move in (17):
Applying Reinhart's (1979) definition in (4) to the representation (17), Dhe representationally C-commands the five categories (subtrees) I', INFLwas, V, Varrested, Dtrace and nothing else. But this unexplained state of affairs is explicable derivationally: Dhe was paired/concatenated (in this case by Move) with I', and I' is a
five-term category/tree/set consisting of precisely I', INFLwas, V, Varrested, and Dtrace. It is entirely natural then that, since Dhe was paired with a 5-term object, and pairing/concatenation is precisely the establishment of syntactic relations, Dhe enters into a relation (what has hitherto been called C-command) with each of these five terms and with nothing else. Notice, given this analysis, a certain (correct) asymmetry is also captured. While it follows that Dhe, as just noted, C-commands each of the five terms of I', the converse is not true—that is, it is not true that each of the five terms of I' C-commands Dhe. For example, INFLwas is a term of I', but INFLwas does not C-command Dhe; rather, since in the course of the derivation INFLwas was paired, this time, by Merge, with V, our analysis rightly predicts that INFLwas C-commands each of the three terms of V, namely, V itself, Varrested, and Dtrace, and nothing else. Given the derivational definition of C-command (16'), we can now answer questions that were unanswerable given the representational definition of C-command (4).
(18) a. Q: Why is it that X C-commands Y if and only if the first branching node dominating X dominates Y?
        A: It is the first (not, e.g., the fifth, sixth, or nth) node that appears relevant since this is the projected node created by pairing of X and Y as performed by both Merge and Move.
     b. Q: Why doesn't X C-command the first branching node dominating X, but instead only the categories dominated by the first branching node?
        A: X was not paired with the first branching node dominating X, by Merge or by Move.
     c. Q: Why is branching node relevant?
        A: Assuming bare phrase structure (Chomsky 1994), no category is dominated by a nonbranching node: Free Projection (as in Chomsky (1993)) is eliminated. Structure Building (Merge and Move) consists of pairing; hence it invariably generates binary branching.
     d. Q: Why must X not equal Y—that is, why doesn't X C-command itself?
        A: Because X is never paired/concatenated with itself by Merge or by Move.
     e. Q: Why is it that in order for X to C-command Y, X must not dominate Y?
        A: If X dominates Y, X and Y were not paired by Merge or by Move.
Thus we propose that pairing/concatenating X and Y, by application of the universal transformational rules Move and Merge, expresses syntactic relations such as C-command. We have, thus far, provided what we believe to be strong explanatory arguments for the derivational construal of C-command proposed here. However, since we have thus far sought only to deduce the empirical content of representational C-command, we have not provided any arguments that representational C-command is empirically inadequate. We will now provide one just such argument, suggesting that representational C-command should be abandoned, because it is inconsistent with an independently motivated hypothesis. By contrast, derivational C-command will be shown to display no such inconsistency. The argument stems from Chomsky's notion of the "invisibility" of intermediate-level categories, or "X' invisibility." Consider again a tree such as (19): (19)
(Vb and Vc each = Vlikes; Da = Dlhe)
Recall that in the input to—that is, the Structural Description of—Merge, there were two categories:
(20) a. Da = 3 terms: 1. Da itself (K is a term of K, see (10)), 2. Dthe, 3. Ndog
     b. Vb = 3 terms: 1. Vb itself, 2. Vlikes, 3. Dit
Given that Da and Vb were merged, derivational C-command (16') entails:
(21) Da C-commands: Vb, Vlikes, and Dit
     Vb C-commands: Da, Dthe, and Ndog
But assuming a relational analysis of a syntactic category's phrase-structure status (Muysken 1982, Freidin 1992), in the representation (19) Vb, being neither a minimal nor a maximal projection of V (the verb), is not a term (or is an "invisible term") of (19) (Chomsky 1995). Therefore, Vb is "stricken from the record" in (21)—that is, it is not a C-commander at all. Consequently, Kayne's (1994) reanalysis of Spec as an X'-adjunct is not required for LCA-compatibility, exactly as Chomsky (1995) proposed. Nor is Vb = X' C-commanded by any category. Thus in the informal representation (19), we have only the following relations, a proper subset of those in (21):
(22) a. Da asymmetrically C-commands Vlikes and Dit
     b. Dthe symmetrically C-commands Ndog
     c. Vlikes symmetrically C-commands Dit
These are, by hypothesis, the desired results. However, given X' invisibility, in the resulting representation, neither Vlikes nor Dit is a member of some term (other than themselves) which excludes Spec VP. Although the term Vb dominates Vlikes and Dit, it is by
hypothesis invisible; thus there is no visible term which dominates Vlikes and Dit, and also excludes Spec VP (= Da). Thus under the representational definition, these two categories should C-command Spec VP, not at all the desired result. In direct contrast to representational C-command, the derivational definition of C-command yields the correct results. Since Vlikes and Dit were Merged with each other, derivational C-command (16') entails that they C-command each other and nothing else. Notice that Vlikes and Dit were, at one derivational point, members of a term, namely, Vb. This term did not, at the point in the derivation immediately following its construction, include any term as a specifier. Thus Vlikes and Dit do not derivationally C-command the specifier, since at no point was either of these terms merged with a category containing the specifier.5 This suggests that the derivational construal of C-command has a crucial empirical advantage over the standard representational definition, if we adopt the X'-invisibility hypothesis: the representational definition of C-command wrongly predicts that categories immediately dominated by a representationally invisible single-bar projection (e.g., the complement) C-command the specifier, and (worse yet) all members of the specifier. Of course, some modification of the representational definition of C-command could be devised to accommodate X'-invisibility, overcoming the problem without the derivational definition of C-command. But this would amount to a simple restipulation of the definition, a convenient but arbitrary choice among the infinite variety of possible definitions of intercategorial relations.
1.3. Discussion
So far, we have proposed a syntactic theory in which C-command is not formally expressed as an unexplained representational definition. We have proposed instead that such syntactic relations are derivational constructs, expressed by the formally simple ("virtually conceptually necessary"; Chomsky 1994) transformational rules, Merge and Move, each motivated on independent grounds in Chomsky (1993).
This new theory of syntactic relations proposed here, seeking ultimately to eliminate central representational definitions such as "Government," "Minimal Domain" and "C-command," is entirely natural and, we think, explanatory: concatenation operations are by hypothesis a necessary part of the syntax—that is, there must exist some concatenative procedure (the application of which, by hypothesis, yields representations of sentences). But while there must be concatenative operations of some sort, it is not the case that in the same sense there "must" be principles—that is, filters or wellformedness conditions on representation. The question we have sought to investigate here is thus: "Are the simple, independently-motivated, and virtually conceptually necessary structure-building operations themselves—specifically, the universalized transformational rules Merge and Move, iteratively applied— sufficient to capture (the) fundamental syntactic relations?" The tentative answer is that they seem to be. If they are, a theory of syntax expressing this will, as a result, attain a much more unified, nonredundant, conceptually simple, and correspondingly explanatory account of what is a most fundamental syntactic construct, that of "syntactic relation," known in advance of experience by virtue of what is, by hypothesis, a human biological endowment for grammar formation. The derivational definition of C-command (16') proposed here eliminates massive redundancy (see (13)), provides principled answers to an infinite number of unanswered questions confronting the definition of representational C-command (see (18)), and also overcomes empirical inadequacies noted previously that result from the interaction of the X'-invisibility hypothesis (Chomsky 1993) and representational C-command (Reinhart 1979). Moreover, the derivational definition is an entirely natural subcase of a more general hypothesis (explored later): namely, all syntactic relations are formally expressed by the operation Concatenate A and B (= the Structural Description) forming C (= the Structural Change) common to both the structure-building operations (transformational rules) Merge and Move. Thus what Merge and Move do is establish relations including the "is a" relation and the C-command relation by virtue of concatenating categories. Nonetheless, despite its very significant advantages over represen-
tational C-command, the derivational definition is still just that, a definition (albeit a very natural one). As it is a definition, we must ask why it obtains. Thus we still have not answered at least one very deep question confronting the derivational approach, namely, (23): (23)
Why does C-command exist at all? Why doesn't a category A simply enter into relations with all constituents in the tree?
The derivational definition of C-command does not answer this question; rather, it simply asserts that X enters into C-command relations with all and only the terms of the category with which it is transformationally concatenated. Additionally, we have yet to explain the nature of more local relations, such as the Head-Complement relation, and the Spec-Head relation, which like C-command have proven to be extremely useful postulates in theories of theta-assignment, predication, Agreement, and Checking relations. In the next section we will approach the matter of an explanation for the derivational definition of C-command (a formalization for which will be presented in chapter 6). The more local relations are the topic of chapters 3 and 4.
1.4. Toward a Deduction of the Derivational Definition of C-Command
Let us begin by considering the case of two categories such that neither C-commands the other:
In (24), Dthe and Dit are such that neither C-commands the other, illustrating the generalization that members of Spec do not C-command X' members, and X' members do not C-command members of Spec. The first conjunct of this generalization is illustrated by, for example, the Binding violation: (25)
*[Spec This picture of John] [X' upsets himself.]
The derivational definition (16') correctly entails that John fails to C-command himself in (25). But the nonexistence of such C-command relations is, perhaps, deducible. Consider what we shall call the First Law: the largest syntactic object is the single phrase-structure tree. Interestingly, this hypothesis is so fundamental, it is usually left entirely implicit. The standard, that is, representational, construal can be stated as follows: (26)
The First Law (Representationally Construed)
A term (= tree, category, constituent) T1 can enter into a syntactic relation with a term T2 only if there is at least one term T3 of which both T1 and T2 are member terms.
Informally, by the most fundamental definition of "syntax," there are no syntactic relations from one tree to another distinct tree; that is, the laws of syntax are "intra-tree" laws: X and Y can enter into syntactic relations only if they are both in the same tree.6 As a minimal assumption, a category should enter into a relation with all other categories in the same tree. Now note that in (24), the merger-derived representation, there is indeed a tree (= the entire tree in (24)) such that Dthe (member of Spec) and Dit (the complement) are both in it. But as shown previously (see (12)), prior to cyclic merger, Da, the Spec tree and the Vb,tree were literally two unconnected trees. By the definition of "syntax" there can be no relation, including C-command, between members of two unconnected trees. To capture this, we propose that we reformulate the implicit First Law as a derivational, not a representational, law:
(27)
The First Law (Derivationally Construed)
T1 can enter into C-command relations with T2 only if there exists no derivational point at which:
i. T1 is a proper subterm of K1, and
ii. T2 is a proper subterm of K2, and
iii. there is no K3 such that K1 and K2 are both terms of K3.
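Purely for concreteness, (27) lends itself to a mechanical check over a record of a derivation. The sketch below is our own illustration, with invented names (first_law_allows, snapshots) and a deliberately simplified record of the cyclic derivation of (24)/(29): a pair of terms is blocked from entering into C-command relations just in case, at some derivational point, each was a proper subterm of one of two distinct, unconnected root trees.

def first_law_allows(snapshots, t1, t2):
    """True unless, at some derivational point, t1 and t2 were proper subterms
    of two distinct, unconnected root trees (cf. (27))."""
    for roots_of in snapshots:            # roots_of maps each category to its current root tree
        k1, k2 = roots_of.get(t1), roots_of.get(t2)
        if k1 is None or k2 is None:      # one of them not yet introduced at this point
            continue
        if k1 != t1 and k2 != t2 and k1 != k2:
            return False                  # unconnected proper subterms: relation blocked
    return True

# Simplified snapshots: before the final Merge, Da = {this, picture of John} and
# Vb = {upsets, himself} are two separate root trees; afterward both sit in K.
snapshots = [
    {"John": "Da", "Da": "Da", "himself": "Vb", "Vb": "Vb"},
    {"John": "K", "Da": "K", "himself": "K", "Vb": "K", "K": "K"},
]
print(first_law_allows(snapshots, "John", "himself"))   # False: blocked, as in (29)
print(first_law_allows(snapshots, "Da", "Vb"))          # True: the merged root trees themselves may relate

The second query anticipates the point made below in connection with (28): because Da and Vb were never proper subterms of unconnected trees, (27) does not block a relation between them.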
Informally, there are no relations between members of two trees that were unconnected at any point in the derivation. In the derivation of (24), assuming cyclicity, there was necessarily a derivational point at which Dthe was a member of Da/Spec and Dit was a member of Vb, but there did not yet exist a tree containing both the branching Da tree and the Vb tree. Therefore, it follows from the derivational construal of the First Law, perhaps the most fundamental law of syntax, that there is no relation between Dthe and Dit. More generally, there are no relations between members of Spec and members of X'. We thus at this point partially derive fundamental syntactic relations like C-command and entirely derive the nonexistence of an infinite number of logically possible but apparently nonexistent syntactic relations, each of which is representationally definable. Notice we do so with no stipulations, no "technicalia," nothing ad hoc, but by appeal only to The First Law, derivationally construed. Notice, crucially, that in (28), the two merged trees, namely, Da and Vb themselves, can enter into syntactic relations even though at one derivational point they were unconnected:
In this case, (27) entails that since neither Da nor Vb is a proper member of some term—that is, each of them is a root node, neither has undergone Merge or Move; hence each is (like a lexical entry) not yet a participant in syntactic relations. This is formalized in (27) by the notion of a "proper subterm." This is a crucial point, for which further justification will be found in chapter 6.7 To summarize, given two nodes (trees/terms/categories) X and Y between which no C-command relation obtains, we do not need to stipulate representational C-command (4) to block the relations. In fact, we do not even need to appeal to the more natural derivational definition of C-command (16'). The derivational construal of the First Law is sufficient: there are no syntactic relations between X and Y if they were, at any derivational point, members of two unconnected trees. As a simple illustration, consider (29):
(29) *[Spec This picture of John] [X' upsets himself.]
This type of binding phenomenon now receives a very simple analysis. A reflexive requires an antecedent of a particular morphosyntactic type, by hypothesis an irreducible lexical property. "To have an antecedent" is to enter into a syntactic relation. However, the First Law, derivationally construed, precludes the reflexive from entering into any syntactic relation with the only morphosyntactically possible candidate "John" since, given cyclic merger, there existed a point in the derivation in which: (1) John was a member of Da (Spec), (2) himself was a member of Vb, and (3) Da and Vb were unconnected trees. This completes our discussion of the deduction of those aspects of the derivational definition of C-command pertaining to two categories X and Y where neither C-commands the other. Next consider the case of asymmetric C-command—that is, X C-commands Y, but Y does not C-command X. In (31), for example, the tree representation of (30), the specifier Djohn representationally C-commands the complement Dhimself, but not conversely:
(30) [Spec John] [X' upsets himself.]
Assuming for simplicity a structure without functional heads, we have (31)
The generalization to be accounted for is that the specifier asymmetrically C-commands the complement. The derivation of (31) is as follows, given cyclic Merge:
Notice that DJohn was never a member of some tree that did not contain Dhimself. Rather, Merge2 pairs/concatenates DJohn itself (a member of the Numeration) with a tree containing Dhimself. Thus, correctly, the First Law (27) allows, (i.e., does not block) a Ccommand relation from John (= T1 of (27)) to himself (= T2 of (27)). Such a relation is allowed by the First Law precisely because there were, in the course of the derivation, never two unconnected trees with one containing John and the other containing himself. In fact, notice that the First Law, a relationship blocker, is altogether inapplicable to this derivation since there never appeared two unconnected trees in this derivation. Rather, Merge1 merges two members of the Numeration (Vlikesand Dhimself), formally forming {Vlikes{Vlikes, Dhimself} }, while Merge2 merges yet another element of the Numeration, DJohn with this object. But since the First Law (a relationship blocker) is inapplicable, we now, in fact, confront a problem: all relations are now allowed, not just the empirically supported (C-command) relation from specifier to complement but also, incorrectly, a C-command relation from the complement to the specifier. That is, in the absence of any supplementary constraints (relationship blockers), the inapplicability of the First Law allows the complement to C-command the specifier. As a possible solution to this problem, first recall that in Chomsky's (1994, 1995) Minimalist Program, all concatenation/ pairing is performed by either Merge or Move. As we have claimed previously, what Merge and Move do, naturally enough, is express syntactic relations. Now, if the universal rules Merge and Move are the sole relationship establishers, and in addition apply cyclically, it is altogether natural that a relation between X and Y is established exactly at the derivational point at which X or Y is concatenated. This is the underpinning of the derivational construal of syntactic relations: that the relations are established at the point in the derivation at which terms are concatenated/paired. Given this, we now have a potential solution to our problem: a complement never bears any relation to (e.g., never C-commands) the specifier, because when a complement is transformationally introduced—for example, Dhimself in (32a)—the specifier does not yet exist. Thus a complement (and all the terms
within the complement) invariably bear no relation to the specifier. This is simply because an entity X can never bear a relation to a nonexistent entity. Crucially, then, the matter of "timing" is the issue at hand: when a category X undergoes Merge/Move, it gets into a relation with everything in the tree with which it is concatenated. If a category Y isn't yet in the tree, the relation from X to Y does not arise. Hence the asymmetry of the relation parallels the asymmetry of the iterative derivational procedure. Thus derivational C-command and perhaps, more generally, the fundamental concept "syntactic relation" appear to be at least partially deducible from a derivational construal of the First Law (the "unconnected tree" law) and from derivational preexistence (X cannot bear a relation to Y where Y is nonexistent). Importantly, the empirical facts are explained by appeal only to the independently motivated formal properties of transformational rules, acting in concert with the quite fundamental, perhaps irreducible First Law, derivationally construed. We suspend a full and proper formalization of C-command with respect to the First Law until chapter 6.
1.5. Summary
In this chapter we examined the representational definition of C-command and argued that the definition appears quite arbitrary and calls for explanation. Noting the redundancy between the application of transformational rules in the derivation of a phrase-marker K and the C-command relations that the representational definition predicts to hold of the terms of K, we hypothesized that the relation is a natural one when viewed from the standpoint of the derivation rather than the representational one. We then proposed a derivational definition of C-command, the character of which, we argued, is more natural than the representational one. Finally we considered the possibility that this definition might be eliminable, as it falls out from the First Law of Syntax construed derivationally.
Notes 1. The term "concatenate" as used here is not to be construed in the mathematical sense of the term as an operation on strings. 2. However, we revise and generalize the notion of sisterhood in chapter 3, suggesting that it is not to be found exclusively internal to the operations Merge and Move but rather across derivations. We also revisit Dominance in chapter 6. 3. The definition of "term" is modified in chapter 2. 4. See chapters 3-6 on movement as "Remerge," which formally unites both Merge and Move as identical instances of pair/Concatenate. 5. Construing the very notion of X'-invisibility in a derivational manner, an interesting property emerges under the derivational approach to C-command. Note that at the point in the derivation at which Vb is merged with the specifier Da Vb is a maximal projection and hence should be "visible" and may itself enter into C-command relations (even though it could not thereafter enter into any new Ccommand relations or syntactic operations, since it is thereafter an intermediatelevel category). It is thus predicted that in general, an intermediate-level category is C-commanded by and C-commands only the category with which it is merged (i.e., a specifier). Thus intermediate categories are potentially not entirely invisible to syntactic relations, contra Chomsky (1994) under this approach; see also chapter 4, section 6 for more extensive exploration, including a proposal for the elimination of intermediate-level categories altogether. 6. By way of contrast, relations among categories in different trees might be the domain of discourse or pragmatic rules. 7. In a nutshell, we will see that the First Law as applied to the input of Merge/ Move yields the C-command relation, while the same law when applied to the output of Merge/Move yields the "term of," "is a," or Dominance relation. We may thus extend the notion of "syntactic relation" to include Dominance and will derive both C-command and Dominance as two sides of the same coin, each reflecting the First Law as applied to input and output of the transformational rules, respectively.
2 A Derivational Application of Interpretive Procedures
The Minimalist Program assumes that "languages are based on simple principles that interact to form often intricate structures" and that "the language faculty is nonredundant, in that particular phenomena are not 'overdetermined' by principles of language" (Chomsky 1993:2).1 Taking these assumptions as a mode of inquiry, this research program seeks a maximally simple design for language. Following this minimalist spirit, the linguistic levels are taken to be only those that are conceptually necessary, namely, PF and LF. Given that PF is a representation in universal phonetics (with no indication of syntactic elements or relations among them), LF is the only level where syntactic relations are expressed representationally. We might call this a "single-level" approach to syntactic relations (Chomsky 1993, 1994, 1995). In the preceding chapter, however, we demonstrated that the arguably most fundamental syntactic relations, namely, C-command relations, are predictable from the way LF representations are generated by the computational system CHL. Under this derivational approach, LF would be conceptually superfluous (and empirically inadequate), at least in determining C-command relations (Epstein 1994, 1995). In this chapter, we examine three problematic cases facing this single-level approach to syntactic relations. Each case concerns syntactic relations that cannot be simultaneously represented at any single level. These syntactic relations thus demonstrate that LF cannot be the only level representing syntactic relations, although other intermediate levels such as D-Structure and S-Structure have been eliminated under the minimalist assumptions. Given this
apparent paradox confronting the single-level approach to syntactic relations, we depart from the minimalist conception of linguistic levels and develop a form of the derivational model of syntax, in which the derivational process not only determines syntactic relations but also provides information directly to the interface systems. Under this derivational model, the syntactic relations in question are deduced from the way these relations are created by CHL; furthermore, upon creation, these relations enter into the interpretive procedures without mediation of linguistic levels.2
2.1 Binding Relations and Reconstruction Asymmetries
This section briefly reviews a minimalist analysis of binding relations (presented in Chomsky 1993), with particular attention to reconstruction asymmetries. First consider the following example: (1)
John wondered [which picture of Bill] he saw t
In (1), he cannot take Bill as antecedent. Under the minimalist conception of linguistic levels, LF is the only level representing syntactic relations for the interpretive version of binding theory, under which the indexing and interpretive procedures are unified along with the binding conditions themselves (Chomsky 1993, Chomsky and Lasnik 1993). Consider the following three conditions (where D is the relevant local Domain):3 (2)
A: If α is an anaphor, interpret it as coreferential with some C-commanding phrase in D.
B: If α is pronominal, interpret it as disjoint from every C-commanding phrase in D.
C: If α is an r-expression, interpret it as disjoint from every C-commanding phrase.
Given that the interpretive version of binding theory applies solely at the LF level, the interpretation of (1) suggests that Bill contained in the landing site of the moved wh-phrase is interpreted as if it were contained in the departure site of the moved wh-phrase. That is, reconstruction is obligatory in (1).
To generate an LF representation of (1) (to which Condition C applies), Chomsky (1993:35) proposes the copy theory of movement, under which a trace left by movement is a complete copy of the moved category. Given the copy theory of movement, (1) is assigned the following intermediate structure: (3)
John wondered [which picture of Bill] he saw t( which picture of Bill)
Chomsky (1993:41) then proposes the Preference Principle, under which the restriction in the operator position (e.g. specifier of C) must be minimized when possible.4 Given the Preference Principle, (3) is converted to the following LF representation: (4)
John wondered [which x] he saw [x picture of Bill]
Given the LF representation (4), in which he C-commands Bill, Condition C interprets Bill as disjoint from he, thereby predicting the interpretation of (1). Now compare (1) with (5): (5)
John wondered [which picture of himself] Bill saw t
In (5) himself can take either Bill or John as antecedent. Given that the interpretive version of binding theory applies solely at the LF level, the interpretation of (5) suggests that himself can be interpreted either in the departure site of the moved wh-phrase or in the landing site of the moved wh-phrase. That is, reconstruction is not obligatory (but rather optional) in (5). To explain this reconstruction asymmetry, exhibited by (1) and (5), Chomsky (1993:40) adopts the LF movement approach to anaphora, under which an anaphor (or part of it) must occupy a position "sufficiently near" its antecedent at LF (see also Chomsky 1986). Given the LF movement approach to anaphora, (5) can yield two distinct LF representations, depending on whether anaphormovement applies to the operator phrase or its trace. Let us examine the relevant aspects of these two derivations of (5). Under the copy theory of movement, (5) is assigned the following intermediate structure:
(6) John wondered [which picture of himself] Bill saw t(which picture of himself)
The next step is to apply anaphor-movement. There are two possibilities: CHL can extract an anaphor (or part of it) out of the departure site of the moved wh-phrase or out of the landing site of the moved wh-phrase. Suppose that the first option is selected. Then CHL constructs the following intermediate structure: (7)
John wondered [which picture of himself] Bill self-saw t(which picture of tself )
The structure in (7) is then converted to the following LF representation with minimization of the restriction in the operator position: (8)
John wondered [which x] Bill self-saw [x picture of tself]
Given the LF representation (8), in which the relevant part of himself is "sufficiently near" Bill (but not John), Condition A interprets himself as coreferential with Bill.5 Suppose that the second option is selected. Then CHL constructs the following intermediate structure: (9)
John self-wondered [which picture of tself] Bill saw t(which picture of himself)
The structure in (9) is then converted to the following LF representation without minimization of the restriction in the operator position: (10) John self-wondered [which x, x a picture of tself] Bill saw x
Given the LF representation (10), in which the relevant part of himself is "sufficiently near" John (but not Bill), Condition A interprets himself as coreferential with John. In mapping (9) to (10), the restriction in the operator position cannot be minimized because that operation would break the
anaphor-chain. Consider the following LF representation resulting from such minimization: (11) John self-wondered [which x] Bill saw [x picture of himself]
In the LF representation (11), the raised anaphor is left without a θ-role: a violation of the θ-Criterion.6 Given that "possibility" in the definition of the Preference Principle is understood relative to other principles such as the θ-Criterion, the minimization of the restriction, inducing a violation of the θ-Criterion, is impossible. Given that (5) yields either the LF representation (9) or the LF representation (10), Condition A predicts the ambiguity of (5). The reconstruction asymmetry, exhibited by (1) and (5), falls under the minimalist assumptions (presented previously). Let us turn to another reconstruction asymmetry. Consider (12a-b):
(12) a. *[which claim [that John was asleep]] was he willing to discuss t
     b. [which claim [that John made]] was he willing to discuss t
In (12a) he cannot take John as antecedent, whereas in (12b) he can take John as antecedent. That is, reconstruction is obligatory in (12a), but not in (12b).7 To explain this reconstruction asymmetry, exhibited by (12a) and (12b), Chomsky (1993:36) appeals to the following distinction concerning the introduction of arguments and adjuncts (see, among others, Riemsdijk and Williams 1981, Freidin 1986, and Lebeaux 1988,): (13) The introduction of arguments must be cyclic, whereas the introduction of adjuncts can be cyclic or noncyclic.
Given (13), the complement clause in (12a), being an argument, must be introduced cyclically, whereas the relative clause in (12b), being an adjunct, can be introduced either cyclically or noncyclically.8 Let us examine how (13) and the minimalist assumptions interact to capture the reconstruction asymmetry, exhibited by (12a-b).
We begin by considering the relevant aspects of the derivation of (12a), in which the complement clause must be introduced cyclically. Under the copy theory of movement, (12a) is assigned the following intermediate structure: (14) [which claim [that John was asleep]] was he willing to discuss t (which claim [that John was asleep])
The structure in (14) is then converted to the following LF representation with minimization of the restriction in the operator position: (15) [which x] was he willing to discuss [x claim [that John was asleep]]
Given the LF representation (15), in which he C-commands John, Condition C (correctly) interprets John as disjoint from he, thereby predicting the interpretation of (12a). Now consider (12b). Unlike (12a), (12b) can yield two distinct LF representations, depending on whether the introduction of the relative clause is cyclic or noncyclic. Let us examine the relevant aspects of these two derivations of (12b). Suppose that the relative clause is introduced into the derivation cyclically. Then CHL constructs the following intermediate structure: (16) [which claim [that John made]] was he willing to discuss t(which claim [that John made]) The structure in (16) is then converted to the following LF representation with minimization of the restriction in the operator position: (17) [which x] was he willing to discuss [x claim [that John made]]
Given the LF representation (17), in which he C-commands John, Condition C interprets John as disjoint from he. If (17) were the only LF representation of (12b), Condition C would force the disjoint interpretation of he and John, the wrong result. Now compare this cyclic derivation of (12b) with its noncyclic counterpart. Suppose that the relative clause is introduced into the
derivation noncyclically, specifically, after wh-movement of which claim. Then CHL constructs the following intermediate structure (in which the trace of the moved wh-phrase does not contain the relative clause which is introduced only after wh-movement): (18) [which claim [that John made]] was he willing to discuss t(which claim)
The structure in (18) is then converted to the following LF representation without minimization of the restriction in the operator position: (19) [which x, x claim [that John made]] was he willing to discuss x
Given the LF representation (19), in which he does not C-command John, Condition C (correctly) does not force a disjoint interpretation of he and John. In mapping (18) to (19), the restriction in the operator position cannot be minimized because that operation would induce an unrecoverable deletion of the relative clause. Consider the following LF representation resulting from such minimization: (20) [which x] was he willing to discuss [x claim]
In the LF representation (20), the relative clause is absent: a violation of the principle of recoverability of deletion.9 Given that "possibility" in the definition of the Preference Principle is understood relative to other principles such as the principle of recoverability of deletion, the minimization of the restriction, inducing a violation of the principle of recoverability of deletion, is impossible. Given that (12b) yields either the LF representation (17) or the LF representation (19), Condition C predicts the absence of reconstruction effects in (12b). The reconstruction asymmetry, exhibited by (12a-b), falls under the minimalist assumptions (incorporating (13)). Chomsky (1993:37) further proposes that reconstruction (i.e., a minimization of the restriction in the operator position) is essen-
tially a reflex of the formation of operator-variable constructions. This proposal asserts that reconstruction holds only for operator-chains, but not for argument-chains. Consider (21a-b)10:
(21) a. they seem to him [IP t to like John]
     b. [α the claim [that John was asleep]] seems to him [IP t to be correct]
In (21a) him cannot take John as antecedent. This disjoint interpretation of John and him is taken to be evidence that him C-commands into the embedded IP (containing John). Given that him C-commands John at LF, Condition C predicts the disjoint interpretation of him and John. By contrast, in (21b) him can take John as antecedent. This contrast immediately follows if reconstruction is not obligatory for argument-chains such as the one formed by the raising of a (containing John) to the matrix subject position in (21b).11 Given that John remains inside the matrix subject position (which is outside the C-command domain for him) at LF, Condition C does not force a disjoint interpretation of him and John.12 In this section, we reviewed Chomsky's (1993) analysis of binding relations, which captures the interpretations of (1), (5), (12a-b), and (21a-b). One crucial assumption adopted in his analysis is that LF is the only level providing a representation of syntactic relations, to which the interpretive version of binding theory applies. In the following section, we examine three problematic cases facing this single-level approach to syntactic relations.
2.2 Contradictory Requirements
This section examines three problematic cases facing the single-level analysis of binding relations (discussed in the preceding section). Each case concerns syntactic relations that cannot be simultaneously represented at any single level. These syntactic relations thus pose a serious problem for the single-level approach.
2.2.1 Minimize and Do Not Minimize
Brody (1995:134) presents the following case as a serious problem for the single-level approach to binding relations:
(22) Mary wondered [which claim [that pictures of herself disturbed Bill]] he made t
In (22) herself takes Mary as antecedent, but he cannot take Bill as antecedent. This interpretation of (22) imposes contradictory requirements on the minimization of the restriction in the operator position. First notice that the obligatorily cyclic introduction of the complement clause precedes wh-movement; hence, under the copy theory of movement, (22) is assigned the following intermediate structure: (23) Mary wondered [which claim [that pictures of herself disturbed Bill]] he made t( which claim [that pictures of herself disturbed Bill])
The next step is to apply anaphor-movement. To capture the coreferential interpretation of Mary and herself, CHL must extract an anaphor (or part of it) out of the landing site of the moved whphrase. This application of anaphor-movement yields the following intermediate structure: (24) Mary self-wondered [which claim [that pictures of tself disturbed Bill]] he made t(which claim [that pictures of herself disturbed Bill]) Example (24) is then converted to the following LF representation without minimization of the restriction in the operator position 13 : (25) Mary self-wondered [which x,x claim [that pictures of tself disturbed Bill]] he made x Given the LF representation (25), in which the relevant part of herself is "sufficiently near" Mary, Condition A interprets herself as coreferential with Mary. But notice that in the LF representation
(25) he does not C-command Bill. Thus Condition C does not force a disjoint interpretation of he and Bill, the wrong result. To capture the disjoint interpretation of he and Bill, the restriction in the operator position must be minimized. Such minimization would yield the following LF representation: (26) Mary self-wondered [which x] he made [x claim [that pictures of herself disturbed Bill]]
Given the LF representation (26), in which he C-commands Bill, Condition C (correctly) interprets Bill as disjoint from he. But notice that, in the LF representation (26), the raised anaphor is left without a θ-role: a violation of the θ-Criterion. Given that (22) yields the LF representation (25), the single-level analysis of binding relations fails to capture the interpretation of (22): herself takes Mary as antecedent, but he cannot take Bill as antecedent. In fact, no derivation can yield an LF representation of the interpretation of (22). This is because the single-level analysis imposes the following contradictory requirements on the minimization of the restriction in the operator position:
(27) The restriction in the operator position must be minimized in order to force the disjoint interpretation of he and Bill, but it must not be minimized in order to allow (a licit θ-theoretic representation of) the coreferential interpretation of Mary and herself.
Simply put, there is no way to both "minimize and not minimize" the restriction in the operator position.
2.2.2 Cyclic and Noncyclic Rule-Application
Lebeaux (1991, 1995) presents another problematic case facing the single-level approach to binding relations. Consider (28):
(28) [which paper [that he gave to Mary]] did every student think that she would like t
In (28) he can be interpreted as a variable bound by every student, but she can take Mary as antecedent.14 This interpretation of (28) imposes contradictory requirements on the introduction of the relative clause. Given (13), (28) can yield two distinct LF representations, depending on whether the introduction of the relative clause is cyclic or noncyclic. Let us examine the relevant aspects of these two derivations of (28). Suppose that the relative clause is introduced into the derivation cyclically, necessarily, before wh-movement. Then CHL constructs the following intermediate structure: (29) [which paper [that he gave to Mary]] did every student think that she would like t(which paper [that he gave to Mary])
The structure in (29) is then converted to the following LF representation with minimization of the restriction in the operator position: (30) [which x] did every student think that she would like [x paper [that he gave to Mary]]
Given the LF representation (30), in which every student Ccommands he, he can be interpreted as a variable bound by every student, in accordance with the Condition on Bound-Variable Interpretation (see, among others, Chomsky 1976, Lasnik 1976, and May 1977)15: (31)
Condition on Bound-Variable Interpretation
A pronoun P can be interpreted as a variable bound by QP only if QP C-commands P.
Given (31), he can be interpreted as a variable bound by every student in the LF representation (30). But notice that in (30) she Ccommands Mary; consequently, Condition C forces the disjoint interpretation of she and Mary, the wrong result. In mapping (29) to (30), the "violation-free" minimization of the restriction in the operator position is forced by the Preference Principle. Thus the cyclic derivation of (28) necessarily yields the
LF representation (30), thereby failing to predict the interpretation of (28). Suppose that the relative clause is introduced into the derivation noncyclically, specifically, after wh-movement. Then CHL constructs the following intermediate structure (in which the trace of the moved wh-phrase does not contain the relative clause introduced after wh-movement): (32) [which paper [that he gave to Mary]] did every student think that she would like t(which paper)
Example (32) is then converted to the following LF representation without minimization of the restriction in the operator position: (33) [which x, x paper [that he gave to Mary]] did every student think that she would like x
Given the LF representation (33), in which she does not C-command Mary, Condition C does not force a disjoint interpretation of Mary and she, the right result. But notice that in (33) every student does not C-command he; consequently, the Condition on BoundVariable Interpretation prohibits he from being interpreted as a variable bound by every student, the wrong result. In mapping (32) to (33), the restriction in the operator position cannot be minimized because that operation would induce an unrecoverable deletion of the relative clause. Such minimization would yield the following LF representation: (34) [which x] did every student think that she would like [x paper]
In the LF representation (34), the relative clause is absent: a violation of the principle of recoverability of deletion. Thus the noncyclic derivation of (28) necessarily yields the LF representation (33), thereby failing to predict the interpretation of (28). Given that (28) yields either the LF representation (30) or the LF representation (33), the single-level analysis of binding relations fails to analyze the interpretation of (28): he can be interpreted
as a variable bound by every student, but she can take Mary as antecedent. In fact, no derivation can yield an LF representation of the interpretation of (28). This is because the single-level analysis imposes the following contradictory requirements on the introduction of the relative clause: (35) The relative clause must be introduced cyclically in order to allow he to be interpreted as a variable bound by every student, but it must be introduced noncyclically in order to allow the coreferential interpretation of Mary and she.
Simply put, there is no way for the relative clause to be introduced both "cyclically and noncyclically."16
2.2.3 C-Command and No C-Command Chomsky (1995:304) discusses a slightly different (but not unrelated) problem posed by raising constructions such as (2la), repeated in (36): (36) they seem to him [IP t to like John]
In (36) him cannot take John as antecedent. This disjoint interpretation of him and John is taken to be evidence that him C-commands into the embedded IP (containing John). Given that him C-commands John at LF, Condition C forces the disjoint interpretation of him and John. Chomsky further takes this disjoint interpretation of him and John to be evidence that him C-commands into the embedded IP (containing they) prior to the raising of they to the matrix subject position. Consider the following intermediate structure (to which the raising of they applies): (37) INFL seem to him [IP they to like John]
Given that him C-commands they in (37), the raising of they over him causes a serious problem for the theory of movement (presented in Chomsky 1995).
Chomsky (1995:297), adopting the view that syntactic operations such as Move are driven by morphological necessity, interprets the movement of a to (a position in the Checking Domain of) K as K attracting a, instead of a moving to K.17 He then proposes that Move raises a to (a position in the Checking Domain of) K if K attracts a. Under this "attraction" view of movement, Chomsky (1995:311) formulates the "Shortest Movement" property as part of the definition of Attract.18 He calls it the Minimal Link Condition: (38)
Minimal Link Condition
K attracts α only if there is no β, β closer to K than α, such that K attracts β.
The notion of closeness is understood in terms of C-command (Chomsky 1995:358):19 (39)
Closeness
β is closer to K than α is if β C-commands α.
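For illustration only, the interaction of (38) and (39) can be pictured as a small search over candidates; the function below and its inputs are our own expository assumptions (the candidates are taken to be already feature-matching, i.e., attractable in principle by K), not a formalization drawn from Chomsky (1995).

def may_attract(K, alpha, candidates, c_commands):
    # K may attract alpha only if no other candidate beta is closer to K,
    # where, by (39), beta is closer than alpha if beta C-commands alpha.
    return not any(c_commands(beta, alpha)
                   for beta in candidates if beta != alpha)

# Superraising (40)/(41): 'it' C-commands 'John', so 'John' is not attractable
# by the matrix INFL, whereas 'it' is.
candidates = ["it", "John"]
c_commands = lambda x, y: (x, y) == ("it", "John")
print(may_attract("matrix INFL", "John", candidates, c_commands))  # False
print(may_attract("matrix INFL", "it", candidates, c_commands))    # True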
Chomsky's (1995) analysis of movement, incorporating the MLC and the notion of closeness, derives the central cases motivating Rizzi's (1990) Relativized Minimality, which include superraising (40)20: (40) *John seems [it is certain [t to be here]]
In (40), the raising of John to the matrix subject position violates the MLC because there is an attractable category closer to the matrix INFL. Consider the following intermediate structure (41) (to which the raising of John applies): (41) INFL seems [it is certain [John to be here]]
In (41), it C-commands John (and matches with the matrix INFL at least in the D-feature).21 Given that it is an attractable category closer to the matrix INFL than is John, the MLC prohibits the matrix
INFL from attracting John. Consequently, Move cannot raise John to the matrix subject position, the right result.22 Now compare (40) with (36), repeated below as (42): (42) they seem to him [IP t to like John] Recall that the disjoint interpretation of him and John (exhibited by (42)) is taken to be evidence that (1) him C-commands into the embedded IP (containing John) and (2) him C-commands into the embedded IP (containing they) prior to the raising of they to the matrix subject position. Consider (37) (to which the raising of they applies), repeated below as (43): (43) INFL seem to him [IP they to like John] In (43) him C-commands they (and matches with the matrix INFL at least in the D-feature). Given that him is an attractable category closer to the matrix INFL than is they, the MLC prohibits the matrix INFL from attracting they. Consequently, Move cannot raise they to the matrix subject position, the wrong result.23 The problem (posed by (42)) is that Chomsky's (1993) analysis of binding relations and Chomsky's (1995) analysis of movement impose the following contradictory requirements concerning the Ccommand relation between him and the embedded IP: (44) him must C-command into the embedded IP (containing John) to force the disjoint interpretation of him and John, but it must not Ccommand into the embedded IP (containing they) because if it did, then the matrix INFL could not attract they (him being a closer attractable category). Simply put, there is no way for him to "C-command and not Ccommand" into the embedded IP.24 In this section, the three problematic cases (22), (28), and (36) were examined. Each case exhibited syntactic relations that cannot be simultaneously represented at any single level, a paradox confronting the single-level approach to syntactic relations. We are thus left with the question: How does the Minimalist Program encode such syntactic relations?
2.3 A Derivational Model of Syntax This section addresses the question of how the Minimalist Program encodes syntactic relations (including the ones that cannot be simultaneously represented at any single level). To provide an answer to this question, we make use of the derivational analysis of syntactic relations, in particular, C-command, presented in chapter 1. The Minimalist Program assigns a derivational approach greater prominence by adopting the derivational view of the syntactic component: CHL selects lexical items from a Numeration and performs a structure-building procedure by successive applications of concatenation. Recall that concatenation (of categories) is a property, shared by Merge and Move (Chomsky 1994, 1995; Kitahara 1997): (45)
Merge
Applied to two objects α and β, Merge forms the new object K by concatenating α and β.
(46)
Move
Applied to the category S with K and α, Move forms the new object S' by concatenating α and K. This operation, if noncyclic, replaces K in S by L = {γ, {α, K}}.25
Taking the concatenation operation to be a necessary (and irreducible) aspect of CHL, we follow the hypothesis put forth in chapter 1 that concatenation establishes syntactic relations. Under this hypothesis, the arguably most fundamental syntactic relation, namely, the C-command relation, is captured by the derivational definition, repeated in (45): (47)
Derivational C-Command X C-commands all and only the terms of the category Y with which X was concatenated by Merge or Move in the course of the derivation.
Concerning the definition of term, instead of Chomsky's (1994, 1995) "top-down" definition, here we adopt the "bottom-up" definition (understood in terms of concatenation), given in (46): (48)
Term i. L is a term of K iff L = K, or ii. L is a term of the category concatenated to form K.
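The recursion in (48) is simple enough to state directly. The following few lines are merely our own transcription of it, under an illustrative representation in which a category records the two categories concatenated to form it; the class and attribute names are assumptions made only for this sketch.

class Cat:
    def __init__(self, label, parts=()):
        self.label = label
        self.parts = parts   # the two categories concatenated to form this one, if any

def is_term_of(L, K):
    # (48): L is a term of K iff L = K, or L is a term of a category concatenated to form K.
    return L is K or any(is_term_of(L, part) for part in K.parts)

likes, it = Cat("likes"), Cat("it")
VP = Cat("VP", (likes, it))     # formed by concatenating likes and it
print(is_term_of(it, VP))       # True
print(is_term_of(VP, it))       # False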
Within this derivational analysis, syntactically significant relations such as C-command relations are predictable from the way they are created by CHL; LF is superfluous at least in determining them. Now recall that (22), (28), and (36) each exhibit the syntactic relations that cannot be simultaneously represented at any single level. Given this apparent paradox confronting the single-level approach to syntactic relations, we depart from the minimalist conception of linguistic levels and propose the following form of the derivational model of syntax: CHL selects lexical items from a Numeration and performs a structure-building procedure that directly provides information (including syntactic relations) to the interface systems. Under this model, the application of the structure-building operations (i.e., rules such as Merge and Move) creates syntactic relations derivationally; furthermore, upon creation, those relations enter into the interpretive procedures without mediation of linguistic levels. Dispensing with the LF mediation requirement, the interpretive version of binding theory can apply within the derivational process itself. We assume such derivational application of interpretive procedures to be constrained as follows (Belletti and Rizzi 1988, Lebeaux 1988, 1991, 1995)26: (49) The application of "disjoint" interpretive procedures occurs at every point of the derivation, whereas the application of "anaphoric" interpretive procedures occurs at any single point of the derivation.
In the following section, we argue that syntactic relations (including the ones exhibited by (22), (28), and (36)) are established derivationally and sent to the interpretive procedures, in accordance with (49).27
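One way to picture the division of labor in (49), offered here only as an informal sketch with invented names and a deliberately skeletal representation of relations, is to let a disjointness procedure (in the style of Condition C) inspect the relations created by every application of Merge/Move, while an anaphoric procedure (in the style of Condition A) merely needs to succeed at some single application.

PRONOUNS, ANAPHORS, R_EXPRESSIONS = {"he"}, {"himself"}, {"Bill", "John"}

def apply_interpretive_procedures(steps):
    # steps: for each application of Merge/Move, the set of (C-commander, C-commandee)
    # relations that the application establishes.
    disjoint, antecedents = set(), set()
    for relations in steps:
        for x, y in relations:
            if x in PRONOUNS and y in R_EXPRESSIONS:
                disjoint.add((x, y))        # Condition C style: enforced at every point
            if y in ANAPHORS:
                antecedents.add((x, y))     # Condition A style: any single point suffices
    return disjoint, antecedents

# (1): the application that introduces 'he' makes it C-command 'Bill' -> disjointness enforced.
print(apply_interpretive_procedures([{("he", "Bill")}]))
# (5): 'Bill' C-commands 'himself' at one point, 'John' at a later one -> either antecedent.
print(apply_interpretive_procedures([{("Bill", "himself")}, {("John", "himself")}]))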
2.4 The Derivation of Binding Relations This section proposes a derivational analysis of binding relations (examined in sections 2.1 and 2.2). Under the proposed analysis, CHL determines all the syntactic relations required for the interpretations of (1), (5), (12a-b), (22), (28), and (36) derivationally. Furthermore, in accordance with (49), CHL provides such syntactic relations directly to the interpretive procedures, as the derivation proceeds. That is, the proposed analysis resolves the paradoxical problems noted in section 2.2 by rejecting the single-level approach in which binding/C-command relations are read off of an LF representation. Instead, LF level interpretation follows from the derivational process itself (irrespective of the final form of LF representation). 2.4.1 Reconstruction Asymmetries Recall the reconstruction asymmetry, exhibited by (1) and (5), repeated in (50a-b) (Chomsky 1993): (50) a. John wondered [which picture of Bill] he saw t b. John wondered [which picture of himself] Bill saw t In (50a) he cannot take Bill as antecedent, whereas in (50b) himself can take either Bill or John as antecedent; reconstruction is obligatory in (50a), but not in (50b). Given (13), the introduction of the complement must be cyclic. Let us examine the cyclic derivations of (50a-b). First consider (50a). At some point of the derivation, CHL constructs (51) by concatenating he with the category of which Bill is a term: (51) he saw [which picture of Bill] This application of concatenation establishes a number of syntactic relations, including the relation: he C-commands Bill. Given (49), Condition C must apply at every point of the cyclic derivation of (50a); hence, Condition C, applying at this point of the derivation,
interprets Bill as disjoint from he.28 Thus in the cyclic derivation of (50a), the disjoint interpretation of he and Bill results from the application of Condition C to (51). Now consider (50b). Given (49), Condition A needs to be satisfied at only one point of the cyclic derivation of (50b). Suppose that Condition A applies before wh-movement, say, when CHL constructs (52) by concatenating Bill with the category of which himself is a term: (52) Bill saw [which picture of himself]
This application of concatenation establishes a number of syntactic relations including the relation: Bill C-commands himself. Hence, Condition A, applying at this point of the derivation, interprets himself as coreferential with Bill. By contrast, suppose that Condition A applies after wh-movement, say, when CHL constructs (53) by concatenating John with the category of which himself is a term: (53) John wonders [which picture of himself] Bill saw t
This application of concatenation in turn establishes a number of syntactic relations including the relation: John C-commands himself. Hence, Condition A, applying at this point of the derivation, interprets himself as coreferential with John. Thus the cyclic derivation of (50b) allows himself to take either Bill or John as antecedent; these two distinct coreferential interpretations result from the two distinct applications of Condition A: one to (52), the other to (53). This derivational analysis of (50a-b) readily extends to the reconstruction asymmetry, exhibited by (12a—b), repeated in (54a— b) (Chomsky 1993): (54) a. [which claim [that John was asleep]] was he willing to discuss t b. [which claim [that John made]] was he willing to discuss t
In (54a) he cannot take John as antecedent, whereas in (54b) he can take John as antecedent; reconstruction is obligatory in (54a), but not in (54b). Given (13), the introduction of the complement clause must be cyclic, whereas the introduction of the relative clause can
be cyclic or noncyclic. Let us examine the cyclic derivation of (54a) and the noncyclic derivation of (54b). First consider (54a). At some point of the derivation, CHL constructs (55) by concatenating he with the category of which John is a term: (55) he was willing to discuss which claim [that John was asleep] This application of concatenation establishes a number of syntactic relations including the relation: he C-commands John. Given (49), Condition C must apply at every point of the cyclic derivation of (54a); hence, Condition C, applying at this point of the derivation, interprets John as disjoint from he. Thus in the cyclic derivation of (54a), the disjoint interpretation of he and John results from the application of Condition C to (55). Now consider the noncyclic derivation of (54b), in which he is introduced before the relative clause containing John. At some point of the derivation, CHL constructs (56) by concatenating he with the category of which John is not (yet) a term: (56) he was willing to discuss [which claim] This application of concatenation establishes a number of syntactic relations, but it does not establish any relation between he and nonexisting categories such as John. Suppose, as the next step, CHL introduces the relative clause containing John noncyclically, yielding (57): (57) he was willing to discuss [which claim [that John made]] Crucially, this application of concatenation establishes no syntactic relation between he and any term of the relative clause containing John. Recall that under the derivational definition the C-command domain for he was determined when he was introduced into the derivation (by the rule-application in (56)). At that point of the derivation, however, John was not (yet) a term of the category concatenated with he; hence, no relation was established between he and John. Thus it follows from the derivational definition that throughout the derivation he does not C-command John; hence,
Condition C does not force a disjoint interpretation of he and John. In the noncyclic derivation of (54b), therefore, the coreferential interpretation of John and he is possible if the introduction of he precedes the introduction of the relative clause containing John.29 Notice that, within the derivational analysis of binding relations, the reconstruction asymmetries, exhibited by (50a-b) and (54a-b), are captured with no appeal to the copy theory of movement, the Preference Principle, or the LF movement approach to anaphora.30 2.4.2 Conditions A and C Recall (22), which involves Conditions A and C, repeated in (58) (Brody 1995): (58) Mary wondered [which claim [that pictures of herself disturbed Bill]] he made t
In (58) herself takes Mary as antecedent, but he cannot take Bill as antecedent. Given (13), the complement clause must be introduced cyclically. Consider the cyclic derivation of (58). At some point of the derivation, CHL constructs (59) by concatenating he with the category of which Bill is a term: (59) he made [which claim [that pictures of herself disturbed Bill]]
This application of concatenation again establishes a number of syntactic relations, including the relation: he C-commands Bill. Given (49), Condition C must apply at every point of the cyclic derivation of (58); hence, Condition C, applying at this point of the derivation, interprets Bill as disjoint from he. Later in the derivation, CHL constructs (60) by concatenating Mary and the category of which herself is a term: (60) Mary wondered [which claim [that pictures of herself disturbed Bill]] he made t
This application of concatenation establishes a number of syntactic relations including the relation: Mary C-commands herself. Given that Condition A can apply at any point of the cyclic derivation of (58), it can apply at this point of the derivation, interpreting herself as coreferential with Mary. Thus the cyclic derivation of (58) prohibits he from taking Bill as antecedent, but it allows herself to take Mary as antecedent. The disjoint interpretation of he and Bill results from the application of Condition C to (59), and the coreferential interpretation of Mary and herself results from the application of Condition A to (60).
2.4.3 Bound-Variable Interpretation and Condition C Recall (28), which involves the Condition on Bound-Variable Interpretation and Condition C, repeated in (61) (Lebeaux 1991, 1995): (61) [which paper [that he gave to Mary]] did every student think that she would like t
In (61) he can be interpreted as a variable bound by every student, but she can take Mary as antecedent. Given (13), the introduction of the relative clause can be cyclic or noncyclic. Let us examine a noncyclic derivation of (61), in which the introduction of the relative clause (containing both he and Mary) occurs between the introduction of the embedded subject she and the introduction of the matrix subject every student. At some point of the derivation, CHL constructs (62) by concatenating she and the category of which the relative clause is not (yet) a term: (62) she would like [which paper]
This application of concatenation establishes no relation between she and nonexisting categories such as Mary. The derivation later reaches (63):
(63) think that she would like [which paper]
Suppose, as the next step, CHL introduces the relative clause noncyclically, yielding (64): (64) think that she would like [which paper [that he gave to Mary]]
This application of concatenation establishes no relation between she and any term of the relative clause containing Mary. Recall that under the derivational definition the C-command domain for she was determined when she was introduced into the derivation (by the rule-application in (62)). Thus it follows from the derivational definition that throughout the derivation she does not C-command Mary. Hence, Condition C does not force a disjoint interpretation of she and Mary. CHL then constructs (65) by concatenating every student and the category of which the relative clause containing he is a term: (65) every student think that she would like [which paper [that he gave to Mary]]
This application of concatenation establishes a number of syntactic relations including the relation: every student C-commands he. Assuming that the Condition on Bound-Variable Interpretation is part of "anaphoric" interpretive procedures, it can apply at any point of the noncyclic derivation of (61). Hence, the Condition on Bound-Variable Interpretation, applying at this point of the derivation, allows he to be interpreted as a variable bound by every student.31 Thus the noncyclic derivation of (61) allows he to be interpreted as a variable bound by every student, but coreference between Mary and she is not blocked. That is, the coreferential interpretation of Mary and she is possible if the introduction of the relative clause containing Mary follows the introduction of she, and the bound-variable interpretation of he is possible if the introduction of the relative clause containing he precedes the introduction of every student.32
2.4.4 The Minimal Link Condition and Condition C Recall (36), which involves the Minimal Link Condition and Condition C, repeated in (66) (Chomsky 1995): (66) they seem to him [IP t to like John]
In (66) him cannot take John as antecedent. This disjoint interpretation of him and John suggests that him C-commands into the embedded IP, but the "violation-free" (i.e., MLC-satisfying) raising of they to the matrix subject position suggests that him does not C-command into the embedded IP. This apparent paradox disappears if him C-commands into the embedded IP only after the raising of they. Given this derivational view of the conflicting C-command relations, let us examine the relevant aspects of the derivation of (66). At some point of the derivation, CHL constructs to him (i.e., a projection of the head to) by concatenating to and him. The C-command domain for him is defined at this point of the derivation; him C-commands only to. The derivation later reaches (67): (67) INFL seem [to him] [IP they to like John]
Suppose, as the next step, CHL raises they to the matrix subject position. Given that him C-commands no category other than to, they counts as the closest attractable category to the matrix INFL. The matrix INFL attracts they; consequently, Move raises they to the matrix subject position, yielding (68): (68) they INFL seem [to him] [IP t to like John]
Thus the derivation of (66), generating (68), induces no violation of the MLC; (66) poses no problem for Chomsky's (1995) analysis of movement.33 Now consider the disjoint interpretation of him and John. Assuming that Condition C is responsible for such disjoint reference, we must determine exactly how CHL establishes the relation: him C-commands John.
Begin by considering the features of to in to him. At minimum, to bears its phonetic features and the Case-feature (checking him). Let us proceed with this minimum assumption. Now notice that (1) phonetic features are stripped away by Spell-Out, and (2) uninterpretable features such as Case-features are eliminated when checked (Chomsky 1995). Given these minimalist assumptions, all the features of to will be eliminated after the stripping of phonetic features and the checking of Case-features. But then what will happen to to him? Here, let us assume that to him undergoes the following elimination process (Kitahara 1996b, 1997)34: (69) [to him] → him
This elimination process (in effect) dissociates him from the derivation by putting him out of any C-command relations. Let us further assume that this otherwise "stranded" category him must be reintroduced into the derivation by concatenating him and the category that was concatenated with to him.35 Given these assumptions, upon the stripping of phonetic features and the checking of Case-features, CHL will construct (70) by concatenating him and the category of which John is a term: (70) they INFL seem [him [IP t to like John]]
This application of concatenation establishes a number of syntactic relations, including the relation: him C-commands John. Given (49), Condition C must apply at every point of the derivation of (66); hence, Condition C, applying at this point of the derivation, interprets John as disjoint from him.36 Thus the derivation of (66) allows the matrix INFL to attract they, while prohibiting him from taking John as antecedent; the disjoint interpretation of him and John results from the application of Condition C to (70).37 In this section, in addition to the derivational analysis of reconstruction asymmetries (see (50a—b) and (54a—b)), we provided a derivational solution to the paradoxical problems confronting the single-level analysis of binding/C-command relations (see (58), (61), and (66)).38
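The dependence of these analyses on the order of rule applications can be illustrated with a small self-contained sketch. It is a toy rendering under simplifying assumptions (categories as nested tuples, and a merge function that merely records the C-command pairs each application creates); the point it illustrates is that cyclic and noncyclic derivations of the same final string leave different derivational records, which is precisely what the treatment of (54a-b) exploits.

def terms(cat):
    """Yield a category and all of its terms; categories are nested tuples."""
    yield cat
    if isinstance(cat, tuple):
        for part in cat:
            yield from terms(part)

def merge(a, b, record):
    """Concatenate a and b, recording the C-command relations this application creates."""
    for x, y in ((a, b), (b, a)):
        record.update((x, t) for t in terms(y))
    return (a, b)

# Cyclic derivation of (54a): the clause containing "John" is already a term of
# the object when "he" is introduced, so the pair <he, John> is recorded.
cyclic = set()
obj = ("which claim", ("that", ("John", "was asleep")))
merge("he", ("was willing to discuss", obj), cyclic)
assert ("he", "John") in cyclic

# Noncyclic derivation of (54b): "he" is introduced while the object is still the
# bare "which claim"; the relative clause containing "John" is concatenated later,
# so no pair relating "he" to "John" is ever recorded, and Condition C is silent.
noncyclic = set()
merge("he", ("was willing to discuss", "which claim"), noncyclic)
merge("which claim", ("that", ("John", "made")), noncyclic)    # late, noncyclic step
assert ("he", "John") not in noncyclic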
2.5 The Derivation of Scope Relations This section extends the derivational analysis to certain scope relations. First consider the following example, which exhibits two distinct scope interpretations (Lebeaux 1995):39 (71) Two women seem [IP t to dance with every senator] two women >< every senator
Example (71) can be interpreted as "there are two women who seem to dance with every senator" (i.e., two women takes scope over every senator) or as "for every senator, there are two women who seem to dance with that senator" (i.e., every senator takes scope over two women).40 To capture this scope ambiguity, following the insight of Lebeaux's derivational analysis, we appeal to the following derivational determination of scope: (72) The scope of α is determined at any point of the derivation.
Given (72), the scope of two women can be determined either before or after the raising of two women to the matrix subject position. Given such scope determination, let us examine the derivation of (71). At some point of the derivation, CHL constructs the embedded IP, given in (73): (73) [IP two women to dance with every senator]
Suppose that the scope of two women is determined at this point of the derivation. Then the scope of two women is the embedded clause.41 Further suppose that the scope of every senator is determined as the embedded clause, perhaps, after the application of Quantifier Raising (May 1977, 1985). Then either two women or every senator may take scope over the other (yielding the scope ambiguity).42 The derivation later reaches the matrix IP, given in (74): (74) [IP two women seem [IP t to dance with every senator]]
Now suppose that the scope of two women is instead determined at this point of the derivation. Then two women, occupying the matrix subject position, obligatorily takes scope over every senator in the embedded IP.43 Thus given (72), the scope of two women can be determined either before or after the raising of two women to the matrix subject position, and these two distinct applications of scope determination are responsible for the scope ambiguity, exhibited by (71). Now compare (71) with (75) (in which each other takes two women as antecedent) (Lebeaux 1995): (75) Two women seem to each other [IP t to dance with every senator] two women > every senator
Example (75) can be interpreted only as "there are two women who seem to each other to dance with every senator" (i.e., two women takes scope over every senator).44 To capture the scope asymmetry, exhibited by (71) and (75), we propose the following minimum modification to the formulation of Condition A45: (76) A: If α is an anaphor, interpret it as coreferential with a category taking scope over α in D.
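The interaction of (72) with the modified Condition A in (76) can be pictured with the following sketch. It is only an illustration under the simplifying assumption (cf. notes 40-41) that the scope of a category, once fixed, is the set of terms of the category with which it is concatenated at that point; the tuples used below are flattened stand-ins rather than the actual binary-branching structures.

def terms(cat):
    """Yield a category and all of its terms; categories are nested tuples."""
    yield cat
    if isinstance(cat, tuple):
        for part in cat:
            yield from terms(part)

def scope_if_fixed_at(sister):
    """Scope of a category if scope is fixed when it is concatenated with sister (cf. (72))."""
    return set(terms(sister))

embedded = ("to dance with", "every senator")               # sister of "two women" in (77)
matrix   = ("seem", ("to", "each other"), ("t", embedded))  # sister of "two women" in (78)

low  = scope_if_fixed_at(embedded)   # scope fixed before raising
high = scope_if_fixed_at(matrix)     # scope fixed after raising

# (76): the anaphor must fall within the scope of its antecedent, so "each other"
# can take "two women" as antecedent only on the high (post-raising) option,
# which also traps "two women" into wide scope over "every senator".
assert "each other" not in low
assert "each other" in high and "every senator" in high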
Given this minimum modification, let us examine the derivation of (75). At some point of the derivation, CHL constructs the embedded IP, given in (77): (77) [IP two women to dance with every senator]
Suppose that the scope of two women is determined at this point of the derivation. Then the derivation may yield the scope ambiguity: either two women or every senator may take scope over the other. But notice that this application of scope determination would prohibit two women from taking scope over each other (which is not yet introduced into the derivation). We assume that a category has one and only one scope. Thus if the scope of two women were determined at this point of the derivation, Condition A would fail to interpret each other as coreferential with two women. Now compare (77) with the matrix IP, given in (78):
(78) [IP two women seem to each other [IP t to dance with every senator]]
Suppose that the scope of two women is instead determined at this point of the derivation. Then two women, occupying the matrix subject position, obligatorily takes scope over both every senator in the embedded IP and each other in the matrix IP. Hence, Condition A, applying at this point of the derivation, interprets each other as coreferential with two women. Thus each other can take two women as antecedent only if the scope of two women is determined after the raising of two women to the matrix subject position, and this required timing of scope determination is responsible for the obligatory wide-scope interpretation of two women, exhibited by (75).46 In this section, we presented a derivational analysis of the scope asymmetry, exhibited by (71) and (75). Needless to say, further research is necessary to provide a more comprehensive understanding of the exact algorithm of scope determination. 2.6 Restrictions on Noncyclic Concatenation This section examines two cases exhibiting unexpected restrictions on noncyclic concatenation. We argue that these restrictions are, in fact, predictable, given the independently motivated "linear order" analysis (Chomsky 1994, 1995, Kayne 1994). First consider the following case, which involves Condition C: (79) he was willing to discuss the claim [that John made]
In (79) he cannot take John as antecedent. Given (13), the introduction of the relative clause can be cyclic or noncyclic. Thus in principle, the relative clause can be introduced either before or after he. But notice that, in order to account for the disjoint interpretation of he and John, he must C-command John. To establish this C-command relation, the introduction of the relative clause containing John must precede the introduction of he. If the relative clause containing John were introduced noncyclically, specifically, after he, no C-command relation would be established between he and any term of the relative clause containing John. That is, under the
derivational definition, the C-command domain for he would be determined before the introduction of the relative clause containing John. Consequently, Condition C would not force a disjoint interpretation of he and John, the wrong result. Thus, for some independent reason, the introduction of the relative clause must be cyclic in the derivation of (79). The problem facing us is that the independently motivated (13) allows the noncyclic derivation of (79), which bleeds the application of Condition C and thereby incorrectly predicts that coreference is allowed. Here, we argue that the linear order analysis (presented in Chomsky 1994, 1995, Kayne 1994) ensures the cyclic introduction of the relative clause in the derivation of (79). Kayne (1994) proposes that linear order is universally determined by structural hierarchy by means of the LCA, an analysis later elaborated by Chomsky (1994, 1995).47 The central aspect of this linear order analysis is that "asymmetrical C-command imposes a linear order of terminal elements; any category that cannot be totally ordered by the LCA is barred" (Chomsky 1995:335).48 Given the linear order analysis, let us return to the noncyclic derivation of (79), in which the relative clause containing John is introduced after he. Crucially, this noncyclic introduction of the relative clause would establish no C-command relation between he and any term of the relative clause containing John. Notice that under the derivational definition the C-command domain for he would be determined at its introduction into the derivation, which would occur before this noncyclic introduction of the relative clause. The absence of those C-command relations entails that no linear order would be determined between he and any term of the relative clause: a violation of the LCA. Thus, under the linear order analysis (adopting the derivational definition of C-command), the LCA restricts the application of noncyclic concatenation. A violation of the LCA would result if the introduction of the relative clause were noncyclic in the derivation of (79). Recall that the cyclic derivation of (79) establishes a C-command relation between he and every term of the relative clause containing John. Consequently, Condition C interprets John as disjoint from he, thereby predicting the interpretation of (79).49 Let us further elaborate this linear order analysis. Recall that the introduction of the relative clause can be cyclic or noncyclic in
the derivation of (54b), whereas the introduction of the relative clause must be cyclic in the derivation of (79). One apparent difference between these two derivations is that wh-movement occurs in the former, but not in the latter. This difference allows us to formulate the following generalization: (80) α must be introduced cyclically if no subsequent movement operation affects a category containing α.
Example (80) immediately follows if the LCA ignores phonetically null elements such as traces left by movement (Chomsky 1995:340).50 That is, the noncyclic introduction of α can circumvent a violation of the LCA if this occurrence of α becomes phonetically null prior to the application of the LCA. Now return to the derivations of (54b) and (79). We can state the crucial difference between them as follows: wh-movement renders the relative clause phonetically null in the derivation of (54b), but not in the derivation of (79). Therefore, the LCA forces the cyclic introduction of the relative clause in the derivation of (79) (but not in the derivation of (54b)). This analysis is further supported by the following case, which poses a long-standing problem for the Minimalist Program (Chomsky 1993, 1994, 1995)51: (81) *who was [α a picture of t_wh] taken t_α by Bill
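The filtering role played by the LCA in these derivations can be rendered schematically as follows. This is a simplified sketch rather than the full definition of the LCA: it checks only whether designated overt items are ordered by asymmetric C-command in the derivational record, the property at stake for he and the terms of a noncyclically introduced relative clause in (79), and for who and the terms of the unraised subject in (81); phonetically null material is simply excluded from the items to be ordered, in line with the point made directly below.

from itertools import combinations

def lca_filter(ccommand_pairs, overt_items):
    """Bar the output unless every pair of designated overt items is ordered by
    asymmetric C-command in the derivational record (a schematic stand-in for the LCA)."""
    asym = {(x, y) for (x, y) in ccommand_pairs if (y, x) not in ccommand_pairs}
    for a, b in combinations(overt_items, 2):
        if (a, b) not in asym and (b, a) not in asym:
            return False      # no linear order determined for <a, b>: barred
    return True

# Noncyclic introduction of the relative clause in (79): the derivational record
# never relates "he" to "John", so the two cannot be linearized and the output is barred.
record = {("he", "discuss"), ("he", "the claim")}
print(lca_filter(record, ["he", "John"]))      # False

# Cyclic introduction supplies the missing pair, and linearization goes through.
record.add(("he", "John"))
print(lca_filter(record, ["he", "John"]))      # True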
Chomsky (1995:328) states that "[(81)] is a Condition on Extraction Domain (CED) violation in Huang's (1982) sense if passive precedes wh-movement, but it is derivable with no violation (incorrectly) if the operations apply in countercyclic order, with passive following wh-movement." Chomsky (1995:328) appeals to the descriptive property of feature strength to ensure the cyclic order.52 Under the current assumptions, however, the LCA forces this cyclic order, namely, passive preceding wh-movement. If passive followed wh-movement, no C-command relation would be established between who (occupying the specifier of C) and any term of α (occupying the subject position). Notice that under the derivational definition the C-command domain for who (occupying the specifier of C) would be determined when who is raised to the
specifier of C, prior to the raising of α to the subject position. Consequently, no linear order would be determined between who (occupying the specifier of C) and any term of α (occupying the subject position): a violation of the LCA (Groat 1995b, Kawashima and Kitahara 1995).53 In this section, we demonstrated that the linear order analysis (adopting the derivational definition of C-command) predicts the cyclic derivations of (79) and (81); the LCA restricts the application of noncyclic concatenation. 2.7 Eliminating Derivational Marking Mechanisms This section briefly compares the proposed derivational analysis with Lebeaux's (1988, 1991, 1995) derivational analysis (adopting the single-level approach to syntactic relations). One crucial difference between these two derivational analyses is that, unlike the proposed derivational analysis, Lebeaux's derivational analysis necessarily invokes some form of derivational marking mechanism, similar to Lasnik and Saito's (1984, 1992) γ-marking mechanism, which marks syntactic relations existing only at intermediate points of a derivation and retains such information for LF level interpretation.54 These two derivational analyses are no longer competing options if the minimalist assumptions are taken seriously. Notice that the implementation of a derivational marking mechanism violates one of the core minimalist assumptions, namely, the Condition of Inclusiveness (Chomsky 1995:228): (82)
Condition of Inclusiveness
No new objects are added in the course of computation apart from rearrangement of lexical properties.
Given (82), CHL cannot create and leave behind (by ad hoc annotation) any information to be used at the LF level (e.g., star * encoding a derivational violation of Condition C, which needs to be retained for LF level interpretation) (see Chomsky 1995:228, fn 7). Thus the Condition of Inclusiveness selects the proposed derivational analysis of syntactic relations, in which the LF mediation
requirement is eliminated along with its supplementary derivational marking mechanism.55 2.8 Summary The central question we addressed in this chapter was: "How are syntactic relations (including the ones that cannot be simultaneously represented at any single level) encoded by CHL?" To provide an answer to this question, we advanced the derivational approach to syntactic relations with the proposal that the derivational process not only determines syntactic relations but also provides such information directly to the interface systems. That is, the interpretive procedures directly interpret the rule-application (not the resulting representation). Under this derivational model of syntax, we demonstrated that the derivational analysis of syntactic relations captures various binding effects (including ones posing a paradoxical problem for the single-level approach to syntactic relations), and does so in accordance with the Condition of Inclusiveness.
Notes 1. See Epstein (1990) for relevant discussion of redundant exclusion. 2. In this chapter, we develop a derivational approach to syntactic relations. In chapter 5, we suggest a parallel derivational approach to linear relations. 3. The interpretive version of binding theory is a return to earlier approaches (see, among others, Chomsky 1981). This theory incorporates the representational definition of C-command (Reinhart 1976, 1979). For detailed discussions of the binding conditions under minimalist assumptions, see also Abe (1993), Heycock (1995), and Freidin (1996). 4. Although the Preference Principle is empirically motivated, it seems far from conceptual necessity. In section 2.4, we propose a derivational analysis dispensing with the Preference Principle. 5. Chomsky (1993:43) suggests that Condition A may be dispensable if the LF movement approach to anaphora is correct and the effects of Condition A follow from the theory of movement (though he notes that further discussion is necessary at many points). See also Epstein (1986) for discussion of LF anaphor-movement and A-chain constraints. 6. Note that the tail t_self of the anaphor-chain (which is associated with a θ-role)
is deleted by the minimization of the restriction in the operator position. 7. Howard Lasnik (personal communication) pointed out to us an empirical question concerning the status of asymmetries such as the one exhibited by (12a-b). See also Abe (1993) and Watanabe (1995) for relevant discussion. As Chomsky (1993:fn 45) notes, it is not clear whether such asymmetries are to be understood as tendencies or sharp distinctions obscured by performance factors. For expository purposes, however, we proceed with the latter possibility. 8. Lebeaux (1988) argues that (13) is deducible from the Projection Principle, which requires arguments, but not adjuncts, to be present at D-Structure. For recent discussion of topics related to (13), see Hasegawa (1996). See also note 49 below. 9. Note that the relative clause is deleted by the minimization of the restriction in the operator position. 10. IP is used when details are irrelevant to our discussion. For discussion of the articulated IP structure, see Pollock (1989) and, among others, Chomsky (1991, 1993, 1994, 1995), Chomsky and Lasnik (1993). 11. Note that the cyclic introduction of the complement clause (containing John) necessarily precedes the raising of α; consequently, the raised α with its complement clause (containing John) occupies the matrix subject position. 12. Lasnik (1993) points out that the absence of minimization is what the Preference Principle entails, given that the Preference Principle is irrelevant when there is no operator-chain. 13. Recall that minimization in such cases is impossible since the result would break the anaphor-chain. 14. Howard Lasnik (personal communication) pointed out to us that the following example may serve better for those who might find it hard to establish variable binding in (28) (because which paper is specific): (i)
[which of [the papers [that he gave to Mary]]] did every student think that she would like t
15. Under the minimalist conception of linguistic levels, the Condition on Bound-Variable Interpretation, like the interpretive version of binding theory, applies solely at the LF level. For recent discussion of bound-variable interpretation under minimalist assumptions, see Abe (1993). 16. The Minimalist Program takes each instance of movement to be triggered by the necessity of feature-checking (Chomsky 1993, 1994, 1995). We assume such movement to be one-step movement directly to a position in which feature-checking takes place, thereby creating no (non-feature-driven) intermediate landing sites (Abe 1993, Kitahara 1994). But even if there is successive-cyclic movement, this would not appear to resolve the contradiction noted here. Lebeaux (1991, 1995), adopting successive-cyclic wh-movement, proposes that intermediate landing sites (e.g., specifier of C bearing no wh-feature) are possible reconstruction sites (see also Barss 1986). Lebeaux's proposal, however, seems inconsistent with Chomsky's (1993:37) interpretation of reconstruction as a reflex of the formation of operator-variable constructions. In section 2.4, we propose a derivational analysis which makes no use of intermediate landing sites. 17. For the definition of Checking Domain, see Chomsky (1995:299). In chapter 3, we will simplify the mechanism of checking.
18. In addition, Chomsky (1995) formulates the "C-command" property between the head and the tail of the chain and the "morphologically driven" property of movement as part of the definition of Attract. For relevant discussion of the latter property, see also Lasnik (1995). 19. The notion of closeness has been interpreted in terms of C-command and equidistance. See, among others, Chomsky (1993, 1994, 1995), Holmberg (1986), Jonas and Bobaljik (1993), Jonas (1995), and Thrainsson (1993). Here, we ignore equidistance as it is irrelevant to our discussion. 20. For detailed discussion of Rizzi's (1990) relativized minimality analysis, see, among others, Frampton (1991), and Chomsky and Lasnik (1993). 21. See Groat (1995a) for relevant discussion of the features of it. 22. See Kitahara (1997) for an extension of Chomsky's (1995) MLC analysis to other movement phenomena involving violations of the Superiority Condition (Chomsky 1973, Pesetsky 1982) and the Proper Binding Condition (Fiengo 1977, May 1977, Saito 1989). 23. Chomsky (1995:305) discusses raising constructions in French, which arguably accord with the expectation of the MLC. But he also notes their unclear status (Chomsky 1995:fn 79). Here, we limit our discussion to raising constructions in English. 24. Notice that (44) has a status slightly different from (27) or (35). Given that the interpretive version of binding theory applies solely at the LF level, whereas the MLC constrains movement applying in the course of the derivation, it is possible that him C-commands into the embedded IP at LF, but not prior to LF. This resolves the paradox noted here. That is, under this crucially derivational approach to the C-command relation in question, the MLC allows the matrix INFL to attract they prior to LF, and, at the same time, Condition C, applying solely at the LF level, interprets John as disjoint from him at LF. In section 2.4, we advance this derivational analysis. 25. In chapter 5 we will call into question the need for the additional operation internal to noncyclic Move that replaces K in Σ by L. 26. The historical antecedents of the derivational interpretive procedures can be found in the generative syntactic work of the early 1970s (see, among others, Jackendoff 1972 and Lasnik 1972). 27. Assuming (49), Lebeaux (1988, 1991, 1995) advances a derivational analysis of binding relations. His analysis, however, differs from our analysis. He retains the single-level approach to syntactic relations by invoking some form of derivational marking mechanism (which derivationally creates and leaves behind information to be used later in LF). In section 2.7, we argue that all such derivational marking mechanisms violate one of the core minimalist assumptions. 28. Note that there is no need to mark anything in some ad hoc annotation (to be interpreted later at the LF level); the relevant information is sent directly to the interpretive procedures (which interpret the output of the rule-application). 29. As long as he is introduced before the relative clause containing John, whether the introduction of the relative clause containing John precedes or follows wh-movement does not affect the analysis presented here. By contrast, under the representational definition of C-command, the relative clause must be introduced after wh-movement of which claim, in order to account for the absence of reconstruction effects in (54b).
30. This analysis suggests that the copy theory of movement, the Preference Principle, and the LF movement approach to anaphora may be eliminable from the Minimalist Program, provided that they are essentially motivated by reconstruction asymmetries such as the ones exhibited by (50a-b) or by (54a-b). But notice that such elimination renders a trace completely invisible to subsequent operations. Needless to say, this issue requires further investigation. For relevant discussion of a trace left by the raising of a subject, see note 36. 31. Note that there is a rather deep parallelism between the proposed derivational analysis of bound-variable interpretation and its historical antecedents such as Lasnik (1976). Under these analyses, bound-variable interpretation is determined cyclically in the overt syntax. 32. The proposed derivational analysis is further supported by (i), in which every student and she switch their positions (Lebeaux 1991, 1995): (i)
[which paper [that he gave to Mary]] did she think that every student would like t
In (i) she cannot take Mary as antecedent if he is interpreted as a variable bound by every student. Consider the following three aspects of the derivation of (i): (1) to allow the bound-variable interpretation of he, every student must C-command he; (2) to establish this C-command relation, the introduction of the relative clause containing he must precede the introduction of every student; and (3) the introduction of the embedded subject every student must precede the introduction of the matrix subject she. Given these three aspects, it follows that the introduction of the relative clause containing Mary precedes the introduction of she; consequently, she C-commands Mary, and Condition C interprets Mary as disjoint from she. Thus the bound-variable interpretation of he entails the disjoint interpretation of she and Mary, the right result. 33. Note that the proposed solution seems to imply that there is no raising in examples like (i) (where him cannot take John as antecedent): (i)
they strike him as angry at John
This implication is consistent with the absence of clear evidence that (i) involves raising (i.e., (i) cannot have the expletive there or an idiom chunk in the subject position). We would like to thank Howard Lasnik and Mamoru Saito for bringing our attention to this issue. 34. For relevant discussion of a similar covert restructuring process under the Agr-based framework, see Chomsky (1991), Epstein (1993), and Ferguson (1994). 35. The necessity of such reintroduction is the residue of Kitahara's (1994) Single Representation Requirement (which requires the root to reflexively dominate every category) or Collins's (1995, 1997) Integration (which requires that every category (except the root) must be contained in another category). We leave for future research how such residue can be recast under current assumptions; see also chapter 5. 36. This analysis may extend to the following "unexpected" binding effects (see, among others, Larson 1988, 1990, Jackendoff 1990, Kuno and Takami 1993, and Ferguson 1994):
(i)
a. I talked [γ to [α John]] about [β himself] b. I received a letter [γ from [α John]] about [β him] c. I spoke [γ to [α him]] about [β John's] mother
Example (ia) exhibits the coreferential interpretation of John and himself, whereas (ib-c) exhibit the disjoint interpretation of him and John. Each interpretation of (ia-c) is captured if the head of γ bears only its phonetic features and the Case-feature (checking α). Under this assumption, all the features of the head of γ are eliminated after the stripping of phonetic features and the checking of Case-features. Thus CHL reintroduces the otherwise "stranded" category α into the derivation; consequently, α enters into a C-command relation with β. The interpretive version of binding theory, therefore, predicts each interpretation of (ia-c). See Ferguson (1994) for a unified Agr-based analysis of (ia-c). Also see Pesetsky (1995) for a different approach to the problem of binding out of PPs. 37. Recall the absence of reconstruction effects in (21b), repeated in (i): (i)
[α the claim [that John was asleep]] seems to him [IP t to be correct]
In (i) him can take John as antecedent. Suppose that all the features of to in to him are eliminated. Then him enters into a C-command relation with the trace t of α (as argued here). Now notice that if t contained the formal features of John (relevant for the binding conditions), then Condition C would force the disjoint interpretation of John and him, the wrong result. One possible way to circumvent this problem is to assume that formal features of trace are deleted (hence erased) if they are not necessary for subsequent operations (Chomsky 1995:304). Under this assumption, by the time him C-commands t, t contains no formal feature of John (since not necessary for subsequent operations); hence, him does not C-command any formal features of John. Consequently, Condition C does not prohibit him from taking John as antecedent. This analysis is (arguably) supported by cases such as (ii) (Belletti and Rizzi 1988): (ii) [α replicants of themselves] seemed to the boys [IP t to be ugly] In (ii) themselves can take the boys as antecedent. Under the current assumptions, after all the features of to in to the boys are eliminated, the boys enters into a C-command relation with the trace t of α. Now notice that, unlike (i), (ii) requires t to contain the formal features of themselves. This asymmetry immediately follows if the formal features of themselves are necessary for subsequent operations (such as LF anaphor-movement). 38. Chomsky (1995:223) presents the following derivational view of CHL: (i)
Viewed derivationally, computation typically involves simple steps expressible in terms of natural relations and properties, with the context that makes them natural "wiped out" by later operations, hence not visible in the representations to which the derivation converges.
The example (i) is supported by the cases examined in this section. Each case requires some syntactic relation to be present (or absent), and we have shown that the presence (or absence) of such relation is readily expressible at one point of the derivation, but not later in the derivation. 39. We use the following notations for scope interpretations: (i)
a. X > Y = X obligatorily takes scope over Y. b. X >< Y = either X or Y may take scope over the other.
40. For expository purposes, we assume the scope of α to be understood as the set of categories C-commanded by α. 41. Strictly speaking, the scope of two women is the set of the terms of the category concatenated with two women. 42. We leave open the exact algorithm of scope determination. For relevant discussion, see, among others, May (1977, 1985), Aoun and Li (1989, 1993), Wyngaerd and Zwart (1991), Lasnik (1993), Hornstein (1994), Reinhart (1995), and Kitahara (1996a). 43. Given that the scope of every senator is clause-bound, the scope of every senator does not interact with the scope of two women determined after the raising of two women to the matrix subject position. 44. Lebeaux (1995:64) calls it a "trapping" effect since two women is "trapped" in the matrix subject position in order to bind each other at LF. He takes this "trapping" effect to be evidence for a single level of representation encoding both scope relations and anaphoric relations. In what follows, we advance a derivational analysis of (75), which makes no use of such a single representation. 45. This modification readily extends to Conditions B and C. 46. This analysis readily extends to the following case, in which herself takes Mary as antecedent (Brody 1995): (i) Mary wondered [how many pictures of herself] everyone painted t how many pictures of herself >< everyone (i) can have either collective interpretation (i.e., how many pictures of herself takes scope over everyone) or distributive interpretation (i.e., everyone takes scope over how many pictures of herself) (see, among others, Cinque 1990 and Rizzi 1990). This scope ambiguity receives the following derivational analysis. If the scope of how many pictures of herself is determined before wh-movement, either how many pictures of herself or everyone can take scope over the other (yielding the scope ambiguity). If the scope of how many pictures of herself is determined after wh-movement, how many pictures of herself, occupying the specifier of the embedded C, obligatorily takes scope over the embedded subject everyone (yielding the collective interpretation). Condition A, applying after wh-movement, interprets herself as coreferential with Mary. Notice that whether the scope of how many pictures of herself is determined before or after wh-movement does not affect this post-wh-movement application of Condition A. Thus, in the derivation of (i), herself can take Mary as antecedent, and the scope of how many pictures of herself can be determined before or after wh-movement. These two distinct applications of scope determination are responsible for the scope ambiguity, exhibited by (i). 47. For detailed discussion of the definitions of the LCA and the relevant notions, see Kayne (1994). See also chapter 5 for further discussion.
48. We assume with Chomsky (1995:fn 108) that the hierarchical relation "asymmetrical C-command" entails the linear relation "precedence." 49. Under the linear order analysis, the difference between the complement clause and the relative clause may be recast as follows (Kayne 1994): (i) a. [DP the [NP claim [CP that [IP John was asleep]]]] b. [DP the [CP claim [C' that [IP John made t]]]] In (ia) the complement clause is taken to be a θ-related N-complement, whereas in (ib) the relative clause is taken to be a non-θ-related D-complement. Given this structural/θ-theoretic difference, (13) may be restated as follows: (ii) The introduction of θ-related elements must be cyclic, whereas the introduction of non-θ-related elements can be cyclic or noncyclic. (ii) may be deducible if θ-relations are established and interpreted derivationally. That is, if a θ-related element were not introduced cyclically, some other category would take the place of that θ-related element, resulting in an unwanted θ-relation. Needless to say, further research is necessary to provide a more comprehensive analysis of the structure of the relative clause along with the derivational analysis of θ-relations. 50. Chomsky (1995:340) takes the LCA to be a principle of the phonological component that applies to the output of Morphology, optionally ignoring or deleting traces. In chapter 5 we will propose a somewhat different account. 51. See Kitahara (1997) for detailed discussion of the minimalist analyses of (81) (presented in Chomsky 1993, 1994, 1995). 52. The CED remains axiomatic in the Minimalist Program (Chomsky 1993, 1994, 1995). For recent discussion of CED effects under the derivational approach, see Toyoshima (1996). 53. Similarly, the linear order analysis (adopting the derivational definition of C-command) prohibits CHL from moving any phonetically non-null category downward or sideways in the overt syntax (as Chomsky (1995:255) notes). See chapters 4 and 5 for further discussion of such illicit movement. 54. Note that derivational marking mechanisms such as this are necessitated by the LF mediation requirement. See Heycock (1995) for relevant discussion. 55. Similarly, Chomsky and Lasnik's (1993) analysis of "island" effects violates the Condition of Inclusiveness because it invokes a star-marking mechanism, which marks * on a trace (where t* is understood as an offending trace at LF). See Kitahara (1997) for a derivational chain-formation analysis, in which an already existing, independently motivated Case-theoretic distinction renders the star-marking mechanism dispensable.
3 Derivational Sisterhood
In the preceding chapters we examined the naturalness of C-command in a derivational approach to structure building and explored the consequences of a derivational approach to reconstruction, scope, and binding. To the extent that semantic interpretation relies on the C-command relation, as in the case of an interpretive approach to binding theory, the LF representation is at best superfluous; the derivational approach to semantic interpretation also has distinct empirical advantages, resolving certain cases of apparently conflicting C-command requirements that cannot be met simultaneously in a single level of representation, such as the level of LF (which is the only "level" of syntactic representation under minimalist assumptions). With respect to the phonological component, we saw as well that countercyclic Merge creates a set of C-command relations that is different from the set determined under a representational definition of C-command; from this difference follows the general impossibility of countercyclic Merge/Move to the extent that the PF branch of the grammar relies on C-command relations in the determination of linear order. In the following chapters we examine some extensions to the derivational approach. We begin with an investigation of operations internal to the syntax. First, we present a critique of Chomsky's (1995) analysis of feature-checking movement as Formal Feature (henceforth FF) raising. In particular, we argue that the special status of the specifier position in overt movement remains unexplained. We propose an alternative conception of the Checking Relation which is not an unexplained, stipulated definition of representational checking configurations but is, rather, a derivational relation of mutual C-command between categories, "Derivational
Sisterhood." We argue that Derivational Sisterhood is the most general derivational notion of locality. Representational sisterhood, in the traditional sense, is then but a subcase of this more general derivational relation of locality. We argue that this approach is conceptually simpler than the FF raising approach and makes better use of the derivational nature of Generalized Transformations. Thus while chapter 1 sheds light on the naturalness of C-command, a potentially "long-distance" relation, this chapter extends the results to the as yet unexplored nature of local relations. We also informally introduce an analysis of movement as "remerger" of a category—that is, Move as nondistinct from Merge—which we develop further in subsequent chapters. In chapter 4, we will argue that a simple version of economy, Minimize Sisters, is natural to this derivational approach to feature checking; this account captures both basic Shortest Move effects and, in the general case, binary branching. Under this approach, the Spec-Head relation appears as an epiphenomenal, representational by-product of a derivational relation and thus need not be formally defined, just as C-command was shown to be a natural derivational relation among categories that does not need to be defined in terms of the structure of the phrase-marker containing those categories. In chapter 5 we examine the matter of linear order again and suggest a weakening of the LCA that gives results akin to trace theory. Such a development is necessitated if we are to take movement to be entirely nondistinct from the operation Merge, as suggested in chapter 3. An interesting side effect of the proposal includes the possibility of a head-parameter that is LCA-consistent, contra Kayne (1994). Finally, in chapter 6, we return to conceptual issues. We begin with a formalization of the C-command relation. With some speculation, we suggest that the picture we have painted can shed some light on why human syntax has been for so long construed to be derivational in nature with some degree of empirical adequacy. In particular, we argue that derivationality accounts for the asymmetric relations required by the physical constraints of the perceptual-articulatory system, while maintaining the simplicity of symmetric structure-building. The approach also resolves a paradox inherent in the standard minimalist account of Greed with respect to feature-checking, namely, the question of why the legitimacy of
syntactic objects to the interfaces should be relevant syntax-internally. 3.1. FF-Raising versus Category Raising In chapter 4 of Chomsky (1995), Chomsky proposes a minimal formulation of the nature of feature checking. It is argued that if feature-checking is in fact the way to account for the phenomenon of displacement in natural language, then the simplest assumption would be that formal features [FF] alone, and not larger categories, enter into a local relation in order for checking to occur. Under the theory of Attract-a, formal FF-checking takes place exclusively via adjunction of the formal features [FF] of a lexical item to a functional head bearing the feature to be checked.1 The result of FF raising is always a chain CHFF = , where FF contains the feature F that does the checking. This theory is indeed extremely simple and natural, incorporating a straightforward notion of locality (put the features into the head of the category requiring checking) and of what undergoes movement (precisely the bundle [FF], which includes the feature that checks). Unlike representational definitions of Checking Domain such as those found in Chomsky (1993, 1994), which (though intuitively appealing and empirically fruitful) are clearly stipulative, the analysis gets right to heart of the matter of featurechecking. One question arises immediately: Why does the entire set [FF] move, and not simply the feature F that creates the Checking Relation, clearly the more minimal assumption? In other words, why does a moving feature F "pied-pipe" the entire bundle of formal features? And why then are not all features of the lexical item also pied-piped, including semantic and phonetic features (if any)? At least then, the atomic nature of lexical items would be maintained; it is, after all, lexical items, and not features, that are drawn from the lexicon and enter syntactic computation. Whether movement of [FF] is in fact the correct analysis would a matter for empirical research, though clearly a "nonminimal" assumption is being made.
The situation becomes more complex when we consider that overt movement does result in displacement of more than just formal features. At a minimum, an entire X° category is displaced; maximally, a category YP is displaced, where even the head of YP may not contain the feature that checks, as in cases of pied-piping in relative clauses such as (1): (1)
the man [CP [DP pictures of whose[+wh] pet rock] C°[+wh] [IP we had never seen t]]
Here, overt feature checking entails the movement of the category DP, whose head is either a null determiner or possibly the lexical noun head pictures. The wh-word whose is clearly not the head of the phrase. If checking formally involves only the raising of a wh-feature to C°, then why do we see category movement in the overt case? The resolution of this problem is taken to result from the following two assumptions. First, we assume that a lexical item will not be legitimate in the PF component (perhaps at the PF interface or as an input to morphology) if it is missing the bundle [FF] that has moved out of it. Thus, minimally, pied-piping of the lexical item that contained the feature bundle will be necessary. Additionally, morphological and other considerations (such as prosodic constraints or ECP-related requirements) may require the pied-piping of even more material for PF legitimacy. As an example, consider (1) again. The word whose is plausibly analyzed as bimorphemic, the structure of the DP object of of being [DP who [D° 's] [NP pet rock]]; thus movement of who alone would strand the affix 's. The two morphemes together, however, do not form a constituent; thus at least the entire DP whose pet rock must be pied-piped. Prosodic or other considerations within English might then also admit of pied-piping of the entire DP pictures of whose pet rock. Second, pied-piping creates at least a second chain in addition to CHFF, namely, the category chain CHCAT = , where α is the category that is pied-piped. For now let us concern ourselves with the case in which α is a maximal projection, putting aside head-movement (though similar considerations arise). In this case,
a copy of α is merged into the specifier of XP, such that the head X of XP contains the head of CHFF. By virtue of this second Merge operation, a PF repair strategy (or perhaps a syntactic repair strategy, admitted by the need for PF legitimacy) can replace the [FF] missing from the moved XP, putting it back into the lexical item from which it was extracted by FF raising. Thus the position Spec XP has a special status, being a position that is in some sense "local enough" for features to be returned from the head X of XP to the lexical item from which they moved. Numerous questions must be answered with respect to both the first and second assumptions. The first assumption is essentially a matter for further research: to what extent can morphological, prosodic, and ECP considerations correctly account for pied-piping and the cross-linguistic variation seen in overt movement? This issue is no longer construed to be a syntactic one; rather, the pied-piping that takes place in the syntax is seen as a reflex of other grammatical requirements of morphology, prosody, and PF legitimacy taken together. It is clear that any theory of overt displacement must account for the domain and range of pied-piping phenomena; this is an issue which goes beyond the scope of our analysis. However, a problematic issue arises here. Under a copy theory of traces, FF raising forms a chain CHFF = , in which both the head and tail of the chain are nondistinct copies of the same features. It is thus unclear why any repair strategy is necessary that must "return" the displaced [FF] into the head from which they were extracted, as there is already a copy of [FF] left behind by movement. We could, of course, abandon the copy theory of traces, but this would clearly deviate from minimal assumptions about movement, under which we expect that no part of a phrase-marker is altered by a syntactic operation except the target of the operation. Additionally, the copy theory of movement as it pertains to category raising is crucial to the minimalist analysis of reconstruction.2 To claim that category movement entails trace copies, while feature movement does not, would be stipulative, in the absence of a principled distinction. The second assumption, that the specifier position constitutes a local relation with a head, encounters more serious conceptual problems. Let us assume that [FF] must indeed be "returned" to the head from which it was extracted and that the head must be in a
local relation with the displaced [FF] for this to happen. The question immediately arises as to why the specifier position is a more local relation than, say, the base position, to begin with. In (1), for example, why could the PF repair strategy not simply lower the raised [FF] in C° back into the lexical item who(se) from which it was extracted, eliminating the need for category movement in the first place? One might make appeal to a C-command requirement on all movement and thereby exclude downward movement of [FF]. But the example of (1) would then entail sideward movement of the feature back into the wh-word embedded in Spec CP, still in violation of the C-command requirement on movement. Adjacency might also be realistically invoked insofar as the repair strategy is a PF requirement which could plausibly admit of an adjacency condition; however, in an example like (1), pied-piping does not create adjacency between the raised [FF] in C° and the head who from which it was raised: the affix 's and the noun pet rock both intervene. Thus, though the intuition is clear that the position Spec XP should somehow constitute a local relation with the head X, it is in no way clear why this should be the case, nor is it clear how this locality should be formalized. Ultimately, then, the significance of the Spec-Head relation in overt movement must be stipulated. Thus while the general issue of pied-piping involves a very general set of questions that lie largely beyond the domain of syntax per se, the issue of the syntactic significance of the Spec-Head relation, under the FF raising theory of checking, remains mysterious and must be stipulated to be the required relation for overt movement. 3.2. Derivational Sisterhood: The Full Use of C-Command In light of the difficulties posed for overt movement under the Attract-F approach to movement, in particular the stipulations that it necessitates concerning the nature of the Spec-Head relation, it might be fruitful to return to a theory akin to GB or early minimalist (Chomsky 1993, 1994) approaches, in which relations between categories, rather than features, are the factors that lie
behind movement and/or checking. But the simple reintroduction of a set of stipulated definitions of relations of locality, such as the definitions of Checking and Complement Domain of Chomsky 1994, would be a return to precisely the sort of unexplained representational definitions we would like to avoid. In this section we explore how the local relation "sisterhood" can be construed in terms of the intercategorial relation we have argued is natural to the derivational system: C-command. Sisterhood, it turns out, is best construed as mutual C-command in a derivational sense. We will then argue that this generalized notion of Derivational Sisterhood will suffice to explain the significance of the Spec-Head relation in feature-checking. Current theories of syntactic relations appear to involve relations of a more local nature than C-command, as in, for example, Government, sisterhood, and M-command in GB theory, or Minimal Complement Domain and Residue (the Checking and Internal Domains, respectively) under the minimalist approach (Chomsky 1993, 1995). As discussed previously, the theory of FF raising also constitutes a notion of locality, though this notion is problematic when overt category movement is considered. Interestingly, FF raising adjoins features to a head, resulting in sisterhood between the head and [FF], insofar as the features and the head share a common dominating segment. Let us now reconsider sisterhood with respect to categories in general.
3.2.1. The Spec-Head Relation as Derivational Sisterhood The arguments here concern the establishment of local relations— that is, sisterhood—through movement. The core idea is that movement of a category is driven by the "need" of that category to be a sister to some feature-checking head and that sisterhood is a fundamental relation that falls out of the derivational approach. The conceptual naturalness of taking sisterhood to be the "local" relation between categories is seen to underlie movement as a syntactic primitive. Intuitively, the structural relation of "sisterhood" implies perhaps the strongest degree of locality. Accounts of modification,
theta-assignment, and predication have long involved the relation of sisterhood between categories as a necessary structural relation. It is the sisterhood relation that we will take to be the fundamental local relation. We will show that the relation is a natural one given the derivational construal of C-command developed in chapter 1. Consider any two categories X and Y in a phrase-marker K created by a derivation D. Four relations may hold between X and Y: (2) 1. Neither X nor Y C-commands the other; 2. X C-commands Y, and Y does not C-command X; 3. Y C-commands X, and X does not C-command Y; 4. X and Y C-command each other.
These four relations constitute the only possible relations (ignoring for now possible relations based on Dominance) that X and Y may have with one another. We therefore expect the syntax to be sensitive to the distinguishing characteristics of these four scenarios. In the first case, no relation between the two categories obtains. In the second and third cases, asymmetric/unidirectional relations of dependency, such as binding and scope, are made possible. It is the fourth case that we will henceforth take to be the "local" relation: sisterhood. Relations between categories that amount to "mutual dependency," like the feature-matching inherent to Checking Relations, can therefore be established only in relation (4). Consider example (3). (3)
Merge(A, B)
C={A, {A, B}} (i.e., C=[AP A B])
We would like to say that A and B are now sisters by virtue of having undergone Merge. Now, by virtue of the single operation "Merge (A, B)," A and B C-command each other. Let us take sisterhood to be defined as mutual C-command. Under this definition, the fact that A and B were merged directly is incidental to the fact that they are sisters; rather, A and B are sisters because they Ccommand each other. Of course, any two categories merged together C-command each other under the derivational definition of C-command; thus categories sharing a common immediate dominator will be sisters, just as in the standard representational construal of sisterhood. Of course, we can define sisterhood in a representational manner and try to base a theory of locality on this definition. But the relation of C-command, natural to the derivational approach, already makes possible the relation 4 in example (2). Perhaps, then, it is the symmetric subcase of C-command that the computational system takes as the "more intimate" local relation. Let us now explore the consequences of viewing locality as symmetric C-command, derivationally construed. The incidental character of symmetric C-command in (3) becomes apparent when we examine cases of mutual C-command in a derivation that are not obtained via direct merger of two categories. We now argue that movement phenomena are driven by the need to establish mutual C-command via "remerger"—movement of a category within a phrase-marker to the root of that phrasemarker—precisely when such "remerger" results in mutual C-command as a property of the derivation. In other words, let us assume that a category can be merged more than once under certain conditions.3 Consider the partial derivation of the following structure, ignoring for now the possibility of an AgrO or "light verb" v projection:
Consider the subject "S" occupying Spec VP. Two questions arise with respect to Case and Agreement checking: (1) why does the subject not check the Case-features of the verb V° in this configuration? (2) Why does movement to Spec IP appear to result in Case and Agreement checking, as in (5)?
Under the representational definition of Checking Relation in Chomsky (1993), the answer to the first question remains unclear: by definition, Spec VP is in the Checking Domain of V, and phrases in that position ought to be able to check features there. Stipulations have been made to overcome this problem, such as: (1) Case checking may take place only against a complex Case-plus-
Agr head (such as [V° V° Agr°] or [T° T° Agr°]); (2) feature-checking may take place only via Checking Relation with a functional category (in particular, the light verb v; see Chomsky 1994, 1995); (3) feature-checking is defined as a property only of movement to a position that creates a checking configuration, not solely in terms of the checking configuration itself (see Chomsky 1994). Besides being stipulations, each of these possibilities suffers from certain drawbacks. Laka (1993), for example, gives arguments from Basque unergative constructions that Case checking of bare NPs takes place against the verb itself, without the aid of an Agr head, making the first and second stipulations empirically problematic if Laka is correct. Our second question, why Spec IP (as opposed to some other position) results in checking, is generally answered by stipulating a particular definition of "Checking Domain" (see Chomsky 1995 for relevant definitions). Given that the subject does not, for whatever reason, check Case in Spec VP, raising it to Spec IP puts it, by definition, into the Checking Domain of I°, where it can check Case- and Agreement-features. But answers of this nature are also stipulative: why is "Checking Domain" defined as it is? Following the derivational approach, it would be preferable to derive the representational definition, or something sufficiently similar to it that remains empirically adequate, from natural and independently motivated derivational considerations. We have seen that C-command is naturally construed as a property of structure-building operations. In (4), I° was merged with VP, thus I° C-commands the subject S in Spec VP, by virtue of that operation. Now, let us take Checking Relations to be relations of mutual dependence: each term in a Checking Relation is dependent on the other for feature-checking. We thus require mutual C-command to obtain in order for feature-checking to take place—that is, we require relation (4) of (2) to obtain. Now, subsequent "remerger" of the subject with IP (i.e., movement of the subject), as in (5), yields C-command of I° by the subject. As a property of the derivation D, then, movement of the subject yields Derivational Sisterhood with I°; that is, there is mutual C-command precisely
because at one point in the derivation I° C-commands S, and at another point (after movement) S C-commands I°. (6)
The derivation D of (4) (ignoring construction of the possibly complex terms S and O):
Derivational Steps                           Relevant C-Command Relations Established
1. Merge (V, O)                              ...
2. Merge (S, V')                             S c-cmds V°
3. Merge (I, VP)                             I° c-cmds S
4. Merge (S, I') (Move of S)                 S c-cmds I°

Relevant ensuing sister relations: S and I° are sisters; S and V° are not sisters.
Note that in step 4, the term S (the subject) is concatenated for a second time, its first concatenation being found in step 2: merger with V'.4 In this case, S needs to enter into a local relation with I° in order to check the matching Case- and D-features of I. By viewing sisterhood as a property of the derivation, we explain movement as a means of creating the kind of locality natural to a system in which structure-building creates asymmetric relations: namely, as a way of gaining symmetric derivational C-command, i.e., Derivational Sisterhood.5 The Spec-Head relation is not a syntactic primitive: the primitive types of relations that we do expect to find are C-command relations. Note that the subject in Spec VP is not in a Checking Relation with V°, even though it is representationally in a "Spec-Head" configuration with V°. The subject C-commands V°, but not vice versa. Thus no Derivational Sisterhood (= mutual C-command) between these two categories obtains in this derivation, and hence no Checking Relation obtains. We achieve, without stipulation, the desired result that the subject in Spec VP is not in a Checking Relation with V°.6
To summarize, let us offer the following natural derivational definitions of sisterhood and the Checking Relation: (7)
Derivational Sisterhood X and Y are Derivational Sisters in a derivation D iff i. X C-commands Y at some point P in D, and ii. Y C-commands X at some point P' in D (where P may equal P').
(8)
Checking Relation X is in a Checking Relation with a head Y° in a derivation D iff i. X and Y° are derivational sisters, and ii. Y° bears an uninterpretable feature identical to some feature in X.
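For concreteness, the bookkeeping behind (6)-(8) can be sketched in a few lines of Python. The sketch is ours, not part of the original analysis: terms are encoded as strings, the "contents" of each term are listed by hand, and the feature-matching clause of the Checking Relation in (8) is omitted.

```python
# Toy model of derivational C-command: each application of Merge/Move records
# the C-command relations it establishes, and Derivational Sisterhood (7) is
# read off the accumulated record rather than off the final representation.

class Derivation:
    def __init__(self):
        self.ccommand = set()  # pairs (X, Y): X has C-commanded Y at some point

    def merge(self, a, a_contents, b, b_contents):
        # Simplified: each concatenated term comes to C-command the other term
        # and everything that term contains.
        for x, others in ((a, {b} | b_contents), (b, {a} | a_contents)):
            for y in others:
                self.ccommand.add((x, y))

    def derivational_sisters(self, x, y):
        # Definition (7): mutual C-command at (possibly distinct) points in D.
        return (x, y) in self.ccommand and (y, x) in self.ccommand

# The derivation D of (4)/(6), ignoring the internal structure of S and O:
d = Derivation()
d.merge("V°", set(), "O", set())                           # 1. Merge (V, O)
d.merge("S", set(), "V'", {"V°", "O"})                     # 2. Merge (S, V')
d.merge("I°", set(), "VP", {"S", "V'", "V°", "O"})         # 3. Merge (I, VP)
d.merge("S", set(), "I'", {"I°", "VP", "V'", "V°", "O"})   # 4. Merge (S, I') (Move of S)

print(d.derivational_sisters("S", "I°"))   # True: sisters, so a Checking Relation can hold
print(d.derivational_sisters("S", "V°"))   # False: S C-commands V°, never conversely
```

Nothing hinges on the string encoding; the point is only that sisterhood is computed from the history of operations, as in (7), rather than from the output phrase-marker.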
Feature-checking, loosely following Chomsky (1993, 1994, 1995), results in the deletion/erasure of the uninterpretable features shared by X and Y. 3.2.2. Locality and Checking Relations Given the analysis proposed so far, we are led to the conclusion that a Checking Relation may obtain between a head and its complement, since they invariably C-command each other. With respect to verbs and their objects, we now confront the possibility that Case is checked in situ, via Derivational Sisterhood between object and verb.7 Such an analysis is, of course, entirely precedented, since, in pre-minimalist theories of Case assignment, object position was taken to be a position to which Case could indeed be assigned. Chomsky's 1994 unification of all Case checking/assignment under the representationally defined (stipulated) Spec-Head configuration is by contrast expressed in this framework as a unification of all Checking Relations under Derivational Sisterhood. Thus unification, albeit distinct, is nonetheless preserved. There are direct empirical advantages to this approach. We will now provide an analysis which allows two desirable unifications concerning the parametric value of feature-strength in the grammar of Icelandic.
First, consider the Case-feature of T in the following Transitive Expletive Construction (9), from Jonas and Bobaljik (1993): (9)
a. [AgrSP það mun [TP einhver T° [VP hafa [VP t(einhver) borðað þetta epli]]]]
      there will someone have eaten this apple
      "Someone will have eaten this apple."
b. * [AgrSP það mun [TP T° [VP hafa [VP einhver borðað þetta epli]]]]
        there will have someone eaten this apple
In (9) we find that the subject einhver must raise overtly to Spec TP from its base position in Spec VP; it cannot remain in Spec VP, as suggested by (9b). We follow Jonas and Bobaljik's analysis of the obligatoriness of this overt movement, assuming that it is a result of the need to check the strong N-feature of T°—that is, its Nominative Case-feature. Under this analysis, the expletive occupies Spec AgrSP; thus the subject is raising only to check the Case-features of T and not potentially strong Agreement- or D-features of AgrS°. We may motivate the appearance of this overt movement by hypothesizing that the Case-feature of T° is strong. But while the subject moves from its hypothesized base position in VP, there is little evidence that the object has shifted in such cases; in fact, object shift out of VP is essentially optional in Icelandic, its only prerequisite being verb-raising (see Holmberg 1986; we will not discuss reasons for this optionality here, but see Groat and O'Neil 1995). Thus object shift is not possible in examples like (10):
(10) * [AgrSP einhveri mun [TP T° [VP hafa [AgrOP þetta eplij [AgrO° borðaðv AgrO°] [VP ti tv tj]]]]]
          someone will have this apple eaten
         "Someone will have eaten this apple."
If Accusative Case is checked in Spec AgrOP, then we must say that the Case-feature of V is weak; otherwise, object shift to check it would be obligatorily overt. One might posit that the object has in fact shifted to a VP-internal Spec in (9a), where it checked Case, followed by raising of the participle to a higher V-projection. But we have found no independent evidence for such string-vacuous movement. It would be far simpler to assume that no object shift has taken place in (9a). Thus, although both V and T bear Case-features, apparently only T bears strong Case-features. However, under our analysis, we can posit a more abstract and general feature "Case," which has one and only one strength specification in the grammar: in Icelandic, it is strong. In fact, the notion of abstract Case divorced from exact feature specification has been implicit in many analyses of Shortest Move constraints, such as Chomsky (1993, 1994, 1995), Ferguson (1993a), Ferguson and Groat (1994, 1995), and Kitahara (1994).8 We expect, then, that this abstract feature receives exactly one specification for strength. In (9a), the object checks Case in situ, under the Derivational Sisterhood relation with the verb that results directly from merger of the object with the verb. The subject, contrastingly, must raise to a position C-commanding T°, in order to obtain Derivational Sisterhood with T°, and thereby a Checking Relation. When we do find overt object shift to Spec AgrOP, or for that matter, subject shift to Spec AgrSP, it is a result of the need to check Agreement (or perhaps categorial D-features), not Case-features. Checking under Derivational Sisterhood allows us to capture the abstract feature "Case" without needing either to postulate string-vacuous object shift to a VP-internal specifier for Case checking or to postulate that the object shifts covertly to check a weak Case-feature of V.9 A similar unifying result can be achieved with regard to the V-features of Agr. In Icelandic, the inflected verb always raises at least as high as AgrS°, as we see in (9a). This suggests that the V-feature of Agr is strong in Icelandic. However, unless we assume that AgrO° differs from AgrS° in its specification for V-feature strength, it becomes difficult to explain why participles do not likewise raise to AgrO°, just as inflected verbs raise to AgrS°. The ungrammaticality of (11) suggests that participles do not in fact
raise out of their base position in VP, since they may only appear to the right of the adverb:
(11) * [AgrSP einhveri mun [TP [VP hafa [AgrOP [AgrO° borðaðv AgrO°] [VP ekki [VP ti tv þetta epli]]]]]]
          someone will have eaten not this apple
         "Someone will not have eaten this apple."
But if Derivational Sisterhood between a head and some other category suffices for feature-checking, as hypothesized in (8), then we can rule out (11) as a violation of Greed. AgrO° was merged with VP and is therefore a derivational sister to VP. Since VP is a projection of the verb, it bears the features of the verb. Thus Derivational Sisterhood obtains between a feature-checking head AgrO° and a category bearing features of the same formal type, VP. Checking obtains without movement of the verb. But this means that movement of the verb, as in (11), is superfluous, since the features that it could check against AgrO° have been checked by direct merger of VP with AgrO°. The movement is thus prohibited by Greed, correctly predicting the ungrammaticality of (11). Now we can additionally say that the V-features of all Agr's are strong in Icelandic, the desired featural unification, without empirical difficulty. Thus there is evidence that, as in earlier theories, the Head-Complement relation may be a Checking Relation.10 To summarize this section, we suggest that having an explanation of the intercategorial relation C-command encourages us to define other relations in terms only of C-command. Given two categories, the only more "complex" relation that arises without stipulation is symmetric C-command—that is, Derivational Sisterhood. Given two categories merged directly, Derivational Sisterhood appears no different from sisterhood in the traditional sense: they are in a sense "twins." But if the categories C-command each other by virtue of two distinct operations in the derivation, the categories nonetheless are in a symmetric relation. Thus movement may be seen as nothing more than the establishment of sisterhood,
sisterhood being defined as mutual C-command. Since C-command is a natural property of derivations, it follows that Derivational Sisterhood is a natural way to capture locality in a derivational system. Derivational sisterhood as a property of terms in a derivation provides us with: (1) a simplification of the analysis of Icelandic feature-strength specification, and (2) a unified analysis of the relation required for checking that subsumes both Spec-Head and Head-Complement relations.
3.2.3. Deriving Proper Binding Condition Effects In this section we will show how the derivational view of sisterhood prohibits both downward and sideward movement of categories, without the need to stipulate a C-command requirement in the definition of Move. Importantly, no appeal to the Strict Cycle Condition (see Chomsky 1973, 1993, and Kitahara 1994) need be made; this becomes an important matter for covert operations which are not constrained by the Strict Cycle Condition. In theories of movement it has proven important to postulate a C-command constraint either on the movement operation (rule-application) itself or on the resulting representation. Such restrictions, in their many forms in the literature, have been dubbed the "Proper Binding Condition," henceforth referred to as the PBC (see Lasnik and Saito 1984 for citations; see also Collins 1994 and Fiengo 1977). The PBC ultimately restricts "sideward" and "downward" movement of phrases. The examples in (12) and (13) exhibit apparently Greed-consistent sideward and downward movement, respectively: (12) a. [CP C°[+wh] [John likes Mary]] doesn't bother who b. [CP who [C' C°[+wh] [John likes Mary]]] doesn't bother t(who). **"Who John likes Mary doesn't bother." (13) a. who wonders [CP C°[+wh] [IP Mary likes John]]
b. t(who) wonders [CP who [C' C°[+wh] [IP Mary likes John]]] **"Wonders who Mary likes John."
In (12), we see sideward movement of a wh-phrase from object position to the specifier of a [+wh] CP subject. (13) illustrates downward movement of a wh-subject to the specifier of a [+wh] CP complement. Such cases, however, surely result in "semantic gibberish," even if they converge syntactically: in both (12) and (13), who takes scope over an embedded clause that does not contain a variable. Furthermore, the derivations would appear to be countercyclic; we restrict the discussion for now to cases of cyclicity at least in the overt syntax. In fact, it is difficult to construct cases of cyclic overt movement in which the moved element does not representationally C-command its trace. However, the difficulty with constructing such cases rests on a particular assumption about the operation Move: namely, that the moved element is a term of the phrase-marker that contains the head against which features are checked. In Chomsky's approach to Generalized Transformations, Move has been explicitly defined as a Singulary Transformation operating on two terms within one phrase-marker (see Chomsky 1995). But there is in fact no a priori reason to impose such a restriction: taking movement, as assumed previously, to be simply "remerger" of a term already merged into a new position, the computational system should allow a term to merge once in one phrase-marker and again in some distinct phrase-marker. Bobaljik and Brown (1997) have proposed precisely such an analysis for the case of head-movement, which appears to be inevitably countercyclic. We thus propose to eliminate two assumptions: (1) that a Ccommand condition on movement is required; (2) that movement may not be "interarboreal;" i.e., it must target a phrase-marker that contains the moving category. We now proceed to construct an example of interarboreal movement that is correctly ruled out without recourse to assumption 1.11 Consider first the derivation of the following sentence: (14) [cp For John to fail to buy roses] forces [TP Mary [T° to] [VP t(Mary)
reconsider]]. "For John to fail to buy roses forces Mary to reconsider."
Note that the construction of CP is "unordered" with respect to the construction of the VP; they may be constructed in parallel. Merger of the VP with T° to introduces a head T° (= to) bearing a strong Case-feature, or perhaps a strong D-feature, which must be checked (the "EPP" effect). If movement is restricted to terms within the phrase-marker containing the target of move (in this case, T°), then the only term that could move to Spec TP would be the DP Mary, as shown. In a cyclic derivation, the independently constructed CP for John to fail to buy roses can merge with [forces + TP] only after TP has been merged with the verb forces; thus neither the DP John nor the DP roses could satisfy the EPP of the lower clause, since the sideward movement involved would be countercyclic. However, there is no reason that John, for example, could not be merged into Spec TP before CP is merged into the main clause. In other words, we might reach a point in the derivation at which we have the following two phrase-markers: (15) a. [CP for John to fail to buy roses] b. [ [To to] [VP Mary reconsider]]
Now let us "remerge" John with the phrase-marker in (15b). We are left, still, with two phrase-markers, with John residing in both: (16) a. [CP for f(John) to fail to buy roses] b. [TP John [To to] [VP Mary reconsider]]
No countercyclic movement has taken place in (16), and John is now in the specifier position that should, representationally, constitute a checking configuration with T°, and the EPP of the lower clause should be satisfied. Continuing the derivation, we would end up with the following: (17) [for t(John) to fail to buy roses] forces John to Mary reconsider * "For to buy roses forces John to Mary reconsider."
LF raising (presumably of the trace of John to Spec PP) of the trace
of John will then check its Case-feature, and Mary can raise to Spec AgrO of the main clause to check its Case-feature. Thus there seems to be no way to rule out this derivation: it is cyclic, convergent, and not gibberish. Of course, we can impose (stipulate) a C-command requirement on movement, or chains, or output representations or levels, to rule out such unwanted cases (see most recently Chomsky 1995:253) or require that a phrase undergoing movement must originate in the phrase-marker that contains its target (in other words, stipulate that movement is a Singulary Transformation); either stipulation will rule out this type of derivation. But we need not impose such restrictions, which would amount to stipulations, under the assumption that checking obtains only via Derivational Sisterhood: John would not, under derivational C-command, become a derivational sister of T°. It would C-command T° by virtue of merging with a phrase-marker containing T°, but at no point in the derivation did T° C-command John; thus the remerger does not result in Derivational Sisterhood. So no checking obtains, and the derivational step of remerging John into the outside phrase-marker is precluded by Greed. The example in (12) can be ruled out in the same way, independently of the fact that it constitutes LF "gibberish": though who could move to Spec CP of the embedded CP before the CP is embedded (thus cyclically) via simple remerger with this distinct phrase-marker, such movement would entail C-command of the [+wh] C° by who, but not the converse—hence no Derivational Sisterhood, hence no movement by Greed. The downward movement in (13) is out simply on the grounds that it is countercyclic and overt (see chapter 2, section 6; see also chapter 5). In general, then, we see that movement across phrase-markers to the specifier of some feature-bearing head H°, which is the only way for a derivation to entail cyclic sideward movement, is ruled out on the grounds that the moving category does not enter a Derivational Sisterhood relation with that head H, and no checking can occur. Note that sideward movement to specifier position is similarly banned in the case of covert movement which might be noncyclic: the fact remains that Derivational Sisterhood is not established by the move, and hence the move could not check features, period.
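The same style of bookkeeping used earlier makes the failure of checking under interarboreal remerger concrete. The sketch below is ours: John's first merger inside the CP subject is left out, since only the C-command facts involving T° (= to) matter, and the hand-listed "contents" of each term are an expository simplification.

```python
# C-command facts recorded operation by operation for the two-tree derivation (15)-(16).

ccommand = set()

def record_merge(a, a_contents, b, b_contents):
    # Each concatenated term comes to C-command the other term and its contents.
    for x, others in ((a, {b} | b_contents), (b, {a} | a_contents)):
        for y in others:
            ccommand.add((x, y))

def derivational_sisters(x, y):
    return (x, y) in ccommand and (y, x) in ccommand

# (15b): Merge(to, [VP Mary reconsider]); to comes to C-command Mary, never John.
record_merge("to", set(), "VP", {"Mary", "reconsider"})

# Licit continuation, as in (14): Mary remerges at the root of her own phrase-marker.
record_merge("Mary", set(), "T'", {"to", "VP", "reconsider"})
print(derivational_sisters("Mary", "to"))   # True: to C-commanded Mary, and Mary
                                            # now C-commands to, so checking is possible.

# Interarboreal remerger, as in (16b): John is merged at the root of the other tree.
record_merge("John", set(), "T'", {"to", "VP", "Mary", "reconsider"})
print(derivational_sisters("John", "to"))   # False: to never C-commanded John, so no
                                            # Checking Relation, and Greed blocks the step.
```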
Consider, however, sideways movement of XP from one phrase-marker to the complement position of H° in another phrase-marker. Since H° and XP would be merged directly, mutual C-command and hence Derivational Sisterhood would obtain; thus the movement constitutes a possible feature-checking operation and should be licit. Furthermore, such movement could be cyclic if H° has not yet merged with anything else—that is, it constitutes a phrase-marker consisting of exactly one term. In general, such derivations will be precluded, since the complement of H° is by hypothesis in a selectional relation with H°; thus, sideways movement of any category to the complement position of the feature-checking heads {C, T, P, v, V, Agr} for the purpose of feature-checking would make it impossible for these heads to select any further complement. In other words, merger of any category XP with a head H° such that H° and XP check features can only take place if XP is the only phrase selected by H°, and hence in general only if XP is not moving from some other position. We can discover no cases in which sideward movement to the complement of H° does not result in selectional problems. There is one interesting exception, however: head-movement of X° to H° entails mutual C-command of X° and H°, since they are merged together directly. Thus we predict that head-movement across phrase-markers is possible in principle, since affixation and the creation of a complex head, rather than the selection of the complement of a head, is involved. This is a desirable result, as it allows us to maintain strict cyclicity without an exception for head-adjunction. As an example, consider overt raising of a verb to T°, as in French: (18)
[TP Jean [T° arrive T°] [VP t(Jean) t(arrive) ]] John arrives "John arrives."
If movement were restricted to Singulary Transformations, then the verb could only merge with T° after T° had merged with VP, since only then would T° and arrive be in the same phrase-marker. But then the derivation is noncyclic, as we have merged arrive with a constituent T° internal to the phrase-marker TP. Chomsky's original
proposal concerning such cases was to except adjunction from strict cyclicity (see Chomsky 1993, 1994), but this is a stipulation that can, under the approach sketched out here, be dispensed with. Consider the point in the derivation in which T has been selected from the Numeration and the VP has been constructed (the subject Jean is still in VP; we ignore for simplicity the possibility of an AgrO or vP projection). We thus have two phrase-markers: (19) a. [T°] b. [VP Jean arrive]
We would expect the next step in the derivation to be merger of T° with VP, but this would entail that subsequent movement of arrive to T° is countercyclic. Instead, let us remerge arrive with T° before merging T° with VP: (20) a. [T° arrive T°] b. [VP Jean t(arrive)]
Now, since T° and V° have been merged directly, they C-command each other and are thus derivational sisters. Therefore, a Checking Relation obtains, and the operation is licit. Now, the derived complex T° in (20a) may merge with the VP, forming (21): (21) [TP [T° arrive T°] [VP Jean t(arrive)]]
T° now C-commands Jean by virtue of this operation and takes scope over the VP; Jean may now merge with TP to C-command T° and thereby check D- and Case-features, completing the derivation of (18). The derivational construal of local relations as mutual C-command thus yields desirable restrictions on movement, without the stipulation of a C-command requirement either in the definition of Move or on resulting representations. Nor need any stipulation be made to the effect that categories cannot move "across" phrase-markers in a derivation. All that is needed in addition to the natural construal of Derivational Sisterhood as a requirement for
feature-checking is the principle Greed: movement must result in a Checking Relation. 3.2.4. Chametsky (1996) on Other Command Relations Chametsky (1996) develops a representational account of C-command. Much in the spirit of chapter 1, he argues that the relation is a natural property of the computational system; however, his construal of the system is a representational theory of the base component, rather than a derivational system based on Generalized Transformations. In chapter 6 we will critique his analysis of C-command. Chametsky (1996:49-51) also considers the status of intercategorial "command" relations other than C-command. In particular, he looks at the case of M-command, which is of particular interest here, since Chomsky's (1995) notion of Checking Domain is closely related (and perhaps rightly considered historically as the heir of) the relation of M-command. Consider the following definition of M-command: (22)
X M-commands Y iff i. every Zmax dominating X dominates Y, and ii. X does not dominate Y.
Chametsky notes descriptively that the M-command relation is a superset of the C-command relation when viewed in terms of the set of commanders for a given category. Thus if X C-commands Y, then X M-commands Y, though the converse does not necessarily hold. He goes on to suggest that his analysis of C-command, in which C-command is defined in terms of the C-commanded category instead of in terms of what categories a given category X C-commands, provides us with the possibility of this insight. His argument is that we should expect all linguistically significant relations to be supersets of C-command; thus it is no surprise that M-command is one of them. However, it is not clear that this point of view provides us with any way of deriving the significance of relations other than
C-command. When we take C-command to be a binary relation—that is, a set of ordered pairs—it is immediately obvious that the set of M-command relations is a superset of the set of C-command relations under either a representational or derivational definition of C-command. Furthermore, that M-command is a definable superset of the C-command relation does not appear in itself to be theoretically significant: the easily defined relation "X is in the same tree as Y" is also a superset of C-command, yet it bears no particular status as such. Thus it appears that any given definition of C-command does not shed further light on the significance of M-command. Furthermore, M-command itself does not capture the apparent significance of the local character of the Spec-Head relation, since M-command of a phrase XP by a head H obtains whenever XP is C-commanded by H. This is not necessarily a local relation at all, as XP might be embedded many clauses down in a structure C-commanded by H. Thus, much like Chomsky's 1994 definition of the "Domain" of a head H, a subset of the M-command relation must be (arbitrarily) defined (such as Chomsky's "Checking Domain") in order to capture the more local Spec-Head relation. Since the local Spec-Head relation therefore appears to be neither a superset nor a subset of the C-command relation, an approach to syntactic relations that seeks linguistic significance in relations that are a superset of C-command does not appear to shed any light on a natural characterization of locality in human syntax.12 That M-command, or something akin to it such as Checking Domain, is linguistically significant is certainly a result that would be advantageous to capture "nonlinguistically"—without recourse to nonminimal assumptions or empirical necessity, as Chametsky observes. The importance of the Spec-Head relation in GB and minimalist theories serves to underscore the significance of such a relation, but on a linguistic level. The derivational approach to sisterhood does appear to shed light on the relation: Derivational Sisterhood is precisely the symmetric subcase of the syntactic relation C-command. Thus, if a "local" relation is to be construed as a relation in which two categories are dependent on each other, then Derivational Sisterhood emerges as a natural way of characterizing local relations.
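The superset point can be checked mechanically on a toy representational structure. The following sketch is ours; the particular tree, the labels, and the use of a simple parent table are expository assumptions, and C-command is computed representationally rather than derivationally, since that is the comparison at issue.

```python
# Representational C-command and M-command (definition (22)) as sets of
# ordered pairs over the toy tree [XP ZP [X' X° [YP Y° WP]]].

parent = {"ZP": "XP", "X'": "XP", "X°": "X'", "YP": "X'", "Y°": "YP", "WP": "YP"}
nodes = set(parent) | set(parent.values())
maximal = {n for n in nodes if n.endswith("P")}   # toy stand-in for Zmax

def dominators(n):
    out = set()
    while n in parent:
        n = parent[n]
        out.add(n)
    return out

def c_commands(x, y):
    # With binary branching: neither dominates the other, and every node
    # dominating X also dominates Y.
    return (x != y and x not in dominators(y) and y not in dominators(x)
            and dominators(x) <= dominators(y))

def m_commands(x, y):
    # (22): every Zmax dominating X dominates Y, and X does not dominate Y.
    return (x != y and x not in dominators(y)
            and (dominators(x) & maximal) <= dominators(y))

cc = {(x, y) for x in nodes for y in nodes if c_commands(x, y)}
mc = {(x, y) for x in nodes for y in nodes if m_commands(x, y)}

print(cc <= mc)                 # True: every C-command pair is an M-command pair
print(("X°", "ZP") in mc - cc)  # True: X° M-commands its specifier ZP without
                                # C-commanding it, so the inclusion is proper
```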
3.3. Summary In this section we examined one property of the derivational C-command relation: it is not antisymmetric—that is, two categories may C-command each other. One case in which a pair of categories C-commands each other is when the two categories are merged with each other; viewed representationally, this constitutes a "sisterhood" relation. Taking this relation to be a "local" relation when viewed derivationally, and generalizing the case of mutual C-command to cases in which a category is merged more than once—that is, cases of movement—we can then analyze local relations such as Checking Relations as nothing more than mutual C-command, derivationally construed, or Derivational Sisterhood. No stipulated representational definition of a local relation, such as the Spec-Head relation,13 is necessary. One consequence of this analysis is that the Head-Complement relation is potentially a Checking Relation (though it remains formally nondistinct from cases of Spec-Head Checking Relations, each falling under Derivational Sisterhood); another consequence is that PBC effects follow without stipulation, even if we free movement to nonsingulary transformations ("interarboreal" movement, across trees).
Notes 1 .The EPP is hypothesized to be a different case, perhaps outside the system of syntactic feature-checking, in that it can be satisfied via merger of a category bearing D-features into Spec XP, where X is a functional category bearing a strong D-feature; thus FF raising is not necessarily involved, as in the case of expletive construction in which the expletive satisfies the EPP of T by direct merger into Spec TP. See Groat (1997) for a different view; see also note 6. 2. Under the derivational approach to reconstruction effects suggested in chapter 2, this is not necessarily an issue, since "reconstructed" positions do not need to be available at the LF level if interpretation proceeds derivationally; "reconstructed" positions merely correspond to intermediate points in a derivation at which the interpretive mechanisms of scope/binding theory apply. But the issue surrounding copies of [FF] in FF raising still remains. 3. Whether and how such an approach differs from a copy theory of traces will be discussed in chapter 5. Note that no notion of a "copy" is being invoked. Instead, a term that moves is simply the input to more than one structure-building rule; one that puts it into a structure, and at least one more that puts it elsewhere in the structure.
4. We represent the output of merger with a head in this derivation to be a barlevel category. In fact, if we follow the bare theory of Chomsky (1994), the output is a VP, since it is at that point in the derivation the maximal projection of the verb. Thus step 2 of the derivation in (6) would more accurately be Merge (S, VP). To avoid confusion we simply use the category levels that the terms end up with at the end of the derivation to refer to them in the derivation in (6); nothing in the present discussion hinges on the notion of minimal, maximal, or intermediate levels of projection. 5. Again, no notion of "copy" is being invoked here: there exists a term S, and this term may be merged wherever and whenever required. As seen in chapter 2, the copy theory is not crucial to the issue of reconstruction under the derivational approach to binding and scope phenomena, since the relevant C-command relations are interpreted derivationally, and binding is assumed to apply derivationally as well. The issue will reemerge in chapter 5 in our discussion of the LCA. 6. In fact, the Derivational Sisterhood approach to checking entails that in general, direct merger (i.e., not movement) of a category XP into the specifier of a head H does not create a Checking Relation between XP and H. Interestingly, one case of direct merger as checking arises in Chomsky' s analysis of the checking of EPP features by direct insertion of an expletive into Spec TP. Under our approach such an analysis will be impossible. There is, however, evidence that expletives do undergo movement from a position below T; see den Dikken (1995), Moro (1997), and Groat (1997). If this is the case, then mutual C-command (hence Derivational Sisterhood) would obtain by raising of the expletive to Spec TP. 7. Recently, Chomsky (1995) has explored the possibility that it is not V, but a light verb v, of which VP is the complement, that bears Case-features. Under such an analysis, the object will always have to shift (either overtly or covertly) in order to check Case, since only through remerger will Derivational Sisterhood between v° and the object entail. Though there is certainly evidence that objects do shift to positions above VP, the connection with Case checking is tenuous, and our analysis will assume that there is no such connection. 8. An alternative approach might make use of the hypothesis that EPP effects follow from the need to check categorial D-features of T, following Chomsky 1995 (see also Ura 1996), and that this is the strong feature driving movement to T, rather than a strong Case-feature. It is also possible to construct a similar analysis of the Icelandic sentences at hand following the "Agr-less" theory of functional projections; for ease of presentation we make use of a system that includes Agreement as a projecting functional head. 9. The differing morphological forms for Accusative and Nominative Case may be seen as resulting not from different types of Case-features, but from the particular category checking the feature. Thus V [+Case] checks Accusative, T [+Case] checks Nominative, etc. 10. The Head-Complement relation has long been presumed to be a Case assigning relation in GB theory. Note, however, that this relation has not been conceived as a "checking" relation per se, but as a structural relation that licenses certain categories (NP, DP, wh-trace etc.). This difference is not relevant to the discussion at hand. 11. An important difference relevant to the analysis in chapter 2 of binding out
of PP should be pointed out here: Under the Derivational Sisterhood approach, no covert formal-feature raising is required, since objects of prepositions can check Case in situ, and overtly. This difference, however, does not necessarily cause a problem. That is, the central aspects of the analysis in chapter 2 can be maintained if the stripping of phonetic features follows the application of subject-raising. See Groat (1997) for further discussion of raising constructions and binding effects under the Derivational Sisterhood approach. 12. Interestingly, assumptions 1 and 2 are two stipulations that distinguish Merge from Move; without such stipulations, they would be nearly identical instantiations of concatenation. The remaining difference is the question of Greed; we return to this question in the coming chapters. 13. Aoun and Sportiche (1983) argue that Government amounts to mutual Mcommand, a notion akin to that of mutual C-command. However, M-command does not appear to be a syntactic primitive, while C-command is, following chapter 1. 14. Spec-Head being understood under whatever structural definition might be in use; i.e., as a case of a Checking Domain configuration, or as M-command.
4 Minimize Sisters and Constraints on Structure Building This chapter develops a theory of local economy of the kind motivated in Chomsky and Lasnik (1993), Chomsky (1994, 1995), and others, to explain Relativized Minimality effects as analyzed by Rizzi (1990). We again take as central the derivational construal of C-command, and argue that Derivational Sisterhood is at the core of syntactic relations that are quantified. Through this approach we hope to show that both Shortest Move metrics and binary merger are the result of this single, natural economy constraint on derivational operations. We conclude with a discussion of the status of intermediate-level categories, suggesting that they can be eliminated by allowing n-ary branching structures while preserving both binary Merge and the desired asymmetric C-command relation between specifier and complement.
4.1. Derivational Economy A simple question emerges if the concept of "economy" in the context of syntactic computation, or the linguistic system in general, is to be explored: "What properties or entities exist in the system that are quantifiable?" There are several possibilities. One candidate is the number of rule-applications in a derivation. If the computational system minimizes the number of steps in a derivation, comparing different derivations given a Numeration N, then we have a form of global economy of the type postulated in Chomsky (1994). We can, for example, assume that economy constrains derivations as stated in (1):
(1)
Global Economy Minimize the number of rule-applications in the Derivation generated given a Numeration N.
Thus if two derivations D1 and D2, each of which is based on the same Numeration N, are both convergent, and each step of D1 and D2 is licensed—that is, each step is either Select from the Numeration, Merge, or Greed-consistent "Remerge" (= movement), then whichever of D1 and D2 has the lesser number of derivational steps (i.e., rule-applications) is the only derivation permitted by the system. The drawback of such an economy metric is that it can lead to extreme computational complexity if the number of possible derivations given a Numeration is high, since all derivations must be computed and compared. Ferguson and Groat (1994, 1995), Ferguson (1994a, b), Poole (1996), Collins (1997), and Oishi (1997) provide various versions of this argument against a global economy metric. 4.2. Local Economy Another candidate for economy is found on the local level. A local economy constraint does not examine an entire derivation but, rather, the possible operations given a particular point in a derivation, choosing the "cheapest" among them as the only licit operation(s), "cheapest" again being understood as a measurement of some quantifiable property of the operations under consideration. Now, given an application of Merge/Move, the quantifiable properties that might vary are the number of C-command relations and the number of Derivational Sisterhood relations created. But at first sight, if Merge has, as a property, the establishment of syntactic relations, then it would appear always to be cheapest to merge nothing, thereby creating zero relations. If creating relations entails a cost, why is structure-building allowed at all? To answer this question requires us to consider the possibility raised in chapter 2: that postsyntactic computation/interpretation (i.e., semantic interpretation and PF computation) takes as its set of
instructions not an output phrase-marker (such as an LF representation or a phrase-marker that is "Spelled-Out"), but the derivational operations themselves. For example, to the extent that certain aspects of semantic interpretation require information concerning whether or not one category C-commands another (as in, say, scope or binding theory as examined in chapter 2), semantic interpretation must, in our system, have access to the derivational operations that determine whether or not the relation holds. Thus semantic interpretation must "see" the derivation, and thereby "see" the operations that yield/do not yield the C-command relations in question. It is thus possible to conceive of a constraint on Merge similar to the Greed constraint on movement: Merge, like Move, must satisfy some interpretive or syntactic requirement in order to be licensed. In fact, given our developing conception of movement as "remerger" of a category, defining Move to be nothing more than Merge applied to a category beyond its first Merge into a phrasemarker, we expect Merge and Move to be identically constrained: neither is licensed unless the operation receives some legitimate interpretation from postsyntactic components (i.e., semantic interpretation and the PF branch of the grammar). As an example, the merger of a verb V and an object DP results in a number of legitimate interpretive consequences: the uninterpretable Case-feature of the DP is checked via sisterhood with the verb, the verb discharges (one of) its internal theta-role(s), and (under a theory incorporating some version of Kayne's LCA),1 the PF component can interpret the order of the verb with respect to the lexical items within the complex DP. Simply put, then, syntactic operations such as Merge are licensed by nonsyntactic/interpretive requirements. If a given requirement such as the discharging of a verb's theta-role or the checking/deletion of an uninterpretable feature entails an operational cost, then as long as that cost is minimized (in a sense to be made precise), the operation satisfying the requirement must be allowed. Thus we can assume that no syntactic operation is licit that neither receives a semantic interpretation, provides a valid instruction to PF computation, nor results in a Checking Relation. So the question now becomes: "When Merge/Move is applied, how might
economy conditions locally constrain the set of possible applications?" 4.2.1. Derivational Sisterhood Relations Entail a Cost One quantifiable relation established by Merge/Move in our system is Derivational Sisterhood. This relation "stands out," as it were, from simple C-command in that through Derivational Sisterhood purely formal features may undergo checking (and thereby deletion/erasure). Unlike asymmetric C-command, which can receive an interpretation by postsyntactic interpretation/computation but does not result in any change in the makeup of the lexical items internal to the syntax, the Checking Relation may alter the feature content of the lexical items from which syntactic structures are built. By virtue of the importance of the Derivational Sisterhood relation to the syntactic operation of feature checking, the establishment of any pair of derivational sisters entails some degree of computational complexity: matching and checkable features of the sisters must be deleted/erased if they are uninterpretable. The syntax must therefore "inspect" each pair of sisters for the possibility of checking. Perhaps, then, it is this subcase of C-command relations—the establishment of local relations via Derivational Sisterhood—that is constrained by some local economy metric. Let us pursue the matter generally. Consider first the following Merge operation: (2)
Merge (X, Y)
(projecting X)
There are three terms in this structure: X°, XP, and Y°/YP. Let us assume some category Z must Merge with one of these three terms (to receive a theta-role, check a feature, etc.). Note that whichever of the three categories it merges with, exactly one pair of derivational sisters will result: Z°/ZP, and the category it merges with. If we take local economy to constrain the number of Derivational Sisterhood relations created, then all binary merger is equally costly: all binary merger creates two sisters. In this example, cyclic merger of Z with XP, being no more costly than countercyclic merger with X° or Y°/YP, becomes the option which must be chosen. (3)
Merge (Z, XP)
(projecting X)
We therefore propose the following economy metric: (4)
Local Economy Minimize the number of derivational sisters created by Merge/Move.
This would appear to be a completely natural local economy metric, since the only quantifiable property of Merge/Move is C-command (thereby including the possibility of Derivational Sisterhood), and the only case of C-command that entails computational complexity for the syntax is Derivational Sisterhood. The next two sections
explore the empirical consequences of constraining Merge/Move by this economy condition. 4.3. More on the Spec-Head Relation Now let us turn to the rule Move, which we again maintain is simply remerger of a category. We will not delve into the various formulations of Greed that have been proposed to constrain admissible movement operations (see Chomsky 1993, 1994, 1995, Ferguson and Groat 1994, Collins 1995); for now we will assume the following formulation in (5), paraphrased from Chomsky (1995): (5)
Greed Move (= Remerge) is licit only if a Checking Relation obtains.2
Now consider the movement of a category XP for feature-checking. Such movement, as we have seen in section 1, must establish a C-command relation with a head that already C-commanded it, thereby establishing mutual derivational C-command—that is, Derivational Sisterhood—and hence a local Checking Relation. Suppose that some C-commanding head H° bears a feature F that needs to be checked and that XP bears feature F. But now imagine that some higher head G° C-commands H°, as in (6):
Movement of XP to Spec GP would entail C-command of H° by XP; thus a Checking Relation would result, since H° and XP would be derivational sisters. Feature-checking would then obtain as long as the features of H° matched the features of XP:
Thus given our derivational definition of sisterhood and the hypothesis that the Checking Relation requires sisterhood, we in fact fail
to capture the desired locality conditions on movement: XP in (6) could move to any position C-commanding H°, and a Checking Relation would obtain, since movement to any position that C-commands H° would result in Derivational Sisterhood between XP and H° and would thus be a licit feature-checking operation. However, notice that this long-distance movement entails the creation of Derivational Sisterhood with both G° and H° (as well as with G' and any other terms "between" H° and XP). Under the Minimize Sisters economy constraint, it would be cheaper to move XP to the specifier of HP by merging it with HP, since no Derivational Sisterhood relation with G° would ensue:
Such movement would, of course, be countercyclic — fine for covert movement, but resulting by hypothesis in a nonconvergent derivation if overt. Overt movement to check strong features of H° thus must precede merger of HP with G°. We thus derive the generalization that XP movement checking features against some head H° will always end up representationally in the specifier of HP. Note that our derivational construal of the Checking Relation does not in itself force phrases to move to the representationally defined specifier of a head against which the phrase checks features. Local economy, in the form of Minimize Sisters, does that. The fact that Spec-Head configurations obtain in Checking Relations is simply a representational by-product of the interaction of Derivational Sisterhood and local economy. We thus derive the significance of the Spec-Head configuration without recourse to stipulative representational definitions of "Checking Domain," "M-command," "Specifier of X," or any other representational definition of the significance of the "Spec-Head" relation to checking. All we need is: 1) C-command, which we argue to be a natural property of a derivational system; 2) some way of incorporating the idea of "mutual dependence," which is straightforwardly mutual C-command (= Derivational Sisterhood), and 3) an economy constraint that minimizes the computational complexity of each operation (= Minimize Sisters).
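How this comparison might be computed can be shown with a short sketch. It is ours and deliberately simplified: the configuration is the one just discussed, the labels are hand-coded, terms intervening inside HP are ignored, and only the pairs relevant to Minimize Sisters are listed.

```python
# Counting the new derivational-sister pairs created by the two candidate
# remergers of XP: at the root of GP ("Spec GP") versus with HP ("Spec HP").

# Terms that have C-commanded XP at some earlier point in the derivation:
ccommanders_of_XP = {"H°", "G°", "G'"}

# For each candidate operation: the term XP is merged with, and the terms XP
# thereby comes to C-command.
candidates = {
    "remerge XP at the root of GP (Spec GP)": ("GP", {"G'", "G°", "HP", "H°"}),
    "remerge XP with HP (Spec HP)":           ("HP", {"H°"}),
}

for label, (target, now_ccommanded) in candidates.items():
    # New sisters: the merge target itself, plus anything XP now C-commands
    # that had already C-commanded XP.
    sisters = {target} | (now_ccommanded & ccommanders_of_XP)
    print(label, "->", len(sisters), "new sister pairs")

# Spec GP creates sisterhood with GP, G', G°, and H° (four pairs); Spec HP only
# with HP and H° (two pairs). Minimize Sisters therefore selects merger with HP,
# which for overt movement must apply before HP merges with G°.
```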
4.4. Shortest Move Effects: Wh-Islands As we can see, a species of Shortest Move constraint results from local economy as expressed in (4). In fact, our version of local economy effectively collapses two kinds of Shortest Move constraints, paraphrased here for illustrative purposes: (9)
Closest Target A category X moving to check feature F must enter a Checking Relation with the closest head H° bearing/capable of checking feature F. (cf. Ferguson and Groat 1994, 1995, Kitahara 1994)
(10)
Closest Mover A head H° bearing a feature F may enter a Checking Relation only with the closest category X that bears feature F. (cf. Chomsky 1995)
In each of these formulations of Shortest Move, a notion of "closeness" is employed, generally defined, and hence stipulated, in terms of representational C-command. Our analysis does not employ any conceptually unmotivated representational definition of closeness but nonetheless obtains results quite similar to both of these formulations of "Shortest Move." Given the high degree of unclarity surrounding the issue of the deletion, erasure, or retention of checked features, we will not fully review previous analyses of the relevant phenomena here. But let us look at wh-islands as a local economy effect that works out well under our system, regardless of the assumptions made concerning the results of feature-checking. Consider the partial derivation given in (11): (11)
[CP C°+wh [IP who I° [VP fixedv what]]]
On the assumption that the wh-feature of C is strong in English, it must be checked overtly. Let us assume for now that it therefore must be checked cyclically and hence at the next step in the derivation. There are two possible moves that would entail a
Checking Relation: movement of what and movement of who. The movement of either category has the property of creating Derivational Sisterhood between the moved category and C° (and between the moved category and C'). But movement of what would additionally entail Derivational Sisterhood between what and I°, what and who, and what and V°. Thus movement of who is cheaper, by (4). We thus generate (12): (12) [CP who [C' C°+wh [IP t(who) I° [VP fixedv what]]]]
This is the point at which different analyses of the outcome of feature-checking come into play. Following Chomsky (1994) and Ferguson and Groat (1994, 1995), we would now say that the wh-feature of who has been checked, and further wh-movement of the category would violate Greed. This is the first scenario to consider. It then follows that the wh-feature of C° is retained, since, clearly, something must be present with wh-features that are interpreted semantically. Now consider further structure-building up to the point shown in (13): (13) [CP C°+wh [IP you wonder [CP who [C' C°+wh [IP t(who) I° [VP fixedv what]]]]]]
Now the matrix C° must be checked. By hypothesis, who now lacks wh-features. Thus the only candidate for checking is what. But what cannot move to the matrix Spec CP without violating Minimize Sisters: there is another move that still results in a Checking Relation but entails fewer new Derivational Sisterhood relations than movement to the matrix Spec CP—namely, merger of what with C', as in (14):

(14) [CP C°+wh [IP you wonder [CP who [C' what [C' C°+wh [IP t(who) I° [VP fixedV t(what)]]]]]]]
Thus only this move is allowed by local economy. But such movement is countercyclic and by hypothesis will crash. Thus the sentence in (15) is correctly underivable.
(15) * What did you wonder who fixed?

We therefore derive Shortest Move effects of the kind stipulated in the definition of Move in (9). Now let's consider different assumptions concerning feature deletion. In an alternative analysis, following Chomsky (1995), we hypothesize that the wh-feature of who is retained upon checking, since it is by hypothesis an inherent, interpretable feature of the word who. This feature must therefore remain visible to LF interpretation (hence it must not be either deleted or erased upon checking). This leaves open the question of whether the wh-feature of C° is erased/deleted upon checking or not; it will turn out that with respect to the island phenomenon in question, the precise analysis doesn't matter. Consider again structure built up to the matrix C° in (16):

(16) [CP C°+wh [IP you wonder [CP who [C' C°+wh [IP t(who) I° [VP fixedV what]]]]]]
At this point, then, both who and what bear wh-features that could check against the matrix C°. Clearly, moving who to the Spec of the matrix CP would entail fewer new Derivational Sisterhood relations than movement of what. Thus what may not be moved, by local economy, and the sentence in (15) is again underivable. But now it looks like who could move to the matrix Spec CP, checking the wh-feature of C°, yielding (17):

(17) * Who do you wonder fixed what?

Chomsky (1995) argues that this structure is uninterpretable, since the phonologically empty C° head lacks sufficient features for interpretation; the presence of if, whether, or some moved wh-phrase is required in C° or Spec CP. Thus while Minimize Sisters allows movement of who but not what in (16), the result is semantic "gibberish." To summarize, then, both "Closest Mover" and "Closest Target" effects follow from our natural construal of local economy, and basic island effects follow in turn, under at least two different independently motivated hypotheses concerning feature deletion.
For now we will leave open the question of equidistance, defined on representations, as developed in Chomsky (1993, 1994, 1995). In fact, we will have to do without it, since this derivational approach employs no notion of "distance" per se, and in particular no stipulative definitions on representations. Equidistance can arise only in the case that two possible remerger operations that could check a given feature F result in the same number of new derivational sister relations.3

4.5. Deriving the Binarity of Merge/Move

Chomsky (1995) suggests that the binarity of Merge/Move—that is, the requirement that exactly two objects be concatenated (to use the terminology of chapter 1)—follows from the "minimal" characterization of UG that the approach adheres to. Merger of fewer than two objects is undefined, while merger of more than two is "less minimal" than the merger of just two objects. However, any sort of claim about how many objects may undergo merger is itself ultimately "nonminimal"; the claim remains stipulative. Also, is merger of fewer than two objects truly undefined? It would clearly be simpler to say nothing at all about how many terms may be concatenated by a single application of the rules Merge and Move. Let us say nothing at all, then, and see what happens. First, consider Merge applied to either zero or one term. We might then imagine representational outputs of Merge applied to either zero or one term, applied mechanically as in (18). (See Collins 1997 for related discussion.)

(18) a. Merge( )   { { } }
     b. Merge(H)   {head(H), {H}}
In (18a), the rule Merge has applied to no terms at all. The output of the rule is a set, as usual. It has no label, since there is no term from which it could inherit a label; the set of categories merged is the empty set, which becomes the only member of the output term. In (18b), there is a term H from which the output inherits a label— namely, the head of H—the set of categories merged is the set containing exactly all the inputs: just H itself.
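For concreteness, the outputs in (18), together with the ordinary two-term case, can be mimicked mechanically; the following sketch is ours and merely mirrors the bare-theory format {label, {categories merged}}, with a pair (label, children) standing in for the set and with the simplifying assumption that the first input projects.

    # Sketch of Merge applied to zero, one, or two terms. A lexical item is a
    # string and serves as its own label; frozenset stands in for the unordered
    # inner set; None marks the absence of a label.
    def label_of(term):
        return term if isinstance(term, str) else term[0]

    def merge(*terms):
        if not terms:                                   # cf. (18a): Merge( )
            return (None, frozenset())                  # no label; nothing merged
        return (label_of(terms[0]), frozenset(terms))   # first input projects

    print(merge())            # cf. (18a): no label, empty set of categories merged
    print(merge("H"))         # cf. (18b): {head(H), {H}}
    print(merge("X0", "Y"))   # cf. the standard binary case XP = {X, {X0, Y}}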
But if, as argued previously, syntactic operations are tolerated only if they provide some instruction to PF computation/semantic interpretation, then both of these possibilities are clearly ruled out: (18a) receives neither a semantic nor a phonological interpretation, since no lexical items or objects constructed from them are mentioned in either the input or the output of the operation. However, (18b) does involve lexical features and their projection. Does the operation entail significant information important to PF computation? Perhaps the projection of a lexical item yields a maximal projection that receives a different PF interpretation than a zero-level category; a plausible case of this might be found in an analysis of prosody sensitive to projection level. But under bare-theoretic assumptions about bar-levels (Chomsky 1995), an unprojected head is already both a minimal and a maximal projection; thus no projection of the category is needed to create a phrasal-level category. What about semantic interpretation? There appears to be no plausible interpretive difference between X and {X, {X}}, nor in fact any difference between minimal and maximal projections in and of themselves. Thus although both of these possibilities are formally consistent with Merge, they are not possible operations.4 What about merger of two or more terms? Recall our construal of local economy, which we derived as the unique and natural local economy metric that permits structure-building under this derivational approach:

(19) Local Economy
     Minimize the number of derivational sisters created by Merge/Move.
Recall that Merge is allowed to incur a cost because it is required to satisfy some postsyntactic interpretive/computational requirement (including feature checking). As long as some requirement is met by the merger of two categories, such as the discharging of one of a verb's theta-roles, it follows straightforwardly that when we apply Merge we will merge at most two categories: merging any more would entail the creation of a greater number of derivational sisters—assuming that feature-checking, selection, and so forth, are binary relations.5 Thus binary merger is seen to result from the same local economy constraint that yields both Shortest
Move effects and the epiphenomenal Spec-Head configuration in Checking Relations. This represents a unification of principles that appear entirely unrelated and stipulative from a representational perspective. Such an approach does, however, leave open the possibility that certain interpretive or feature-checking requirements might require more than two categories to enter a relation, in order for those requirements to be satisfied. In such cases, syntactic binary branching would not be enforced by CHL.
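The arithmetic behind this point is simple, on the natural assumption that categories concatenated in a single step all come to C-command one another: one step concatenating n categories creates n(n-1)/2 sisterhood pairs, only one of which is needed for any given binary checking or selection requirement. A trivial sketch:

    # Pairwise sisterhood relations created by concatenating n categories at once.
    from math import comb
    for n in (2, 3, 4):
        print(n, "categories merged at once ->", comb(n, 2), "sister pairs")
    # 2 -> 1 pair; 3 -> 3 pairs; 4 -> 6 pairs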
4.6. Binary Merge, Trinary Branching, and the Elimination of Intermediate-Level Categories

In this section we advance a tentative proposal concerning the invisibility of intermediate-level categories, based on an observation made in chapter 1 concerning Chomsky's (1995) hypothesis that intermediate-level categories, so-called single-bar projections, are not available for syntactic computation. In chapter 1, we argued that if these categories are truly invisible, then under a representational definition of C-command, empirically incorrect C-command relations result, while under the derivational definition, X' invisibility can be maintained along with the empirically correct C-command relations. First, we recap the X' invisibility hypothesis, show the C-command dilemma, and review the approach suggested in chapter 1. Second, we argue that X' invisibility is not in fact always expected under Chomsky's 1995 analysis of "light verb" vP constructions. Third, we make the proposal that a type of Merge operation exists in which no category is projected.

4.6.1. Chomsky's 1995b X' Invisibility Hypothesis

Chomsky (1995) discusses the role of inclusiveness in the minimalist framework. The idea is a guideline for the construction of the theory of CHL, but is not a principle of grammar particular to any theory. Chomsky (1995:225) writes:
    [A] natural condition is that outputs consist of nothing beyond properties of the lexicon (lexical features)—in other words, that the interface levels consist of nothing more than an arrangement of lexical features. To the extent that this is true, the language meets a condition of inclusiveness. We assume further that the principles of UG involve only elements that function at the interface levels; nothing else can be "seen" in the course of the computation...

As it stands, Chomsky's derivational approach does not actually adhere to this further assumption. Certain features introduced by the lexicon—namely, uninterpretable formal features—are certainly "seen" in the course of computation insofar as they motivate (and require) checking operations that erase them from the representation entirely before the representation meets the LF interface. But these elements (uninterpretable features) do not function at the interface levels; that is why they are erased. We will see in chapter 6 that the level-less, derivational approach, as applied to interpretation, actually derives something quite akin to the "further assumption" described previously, that principles of UG involve only elements that function at the interface levels. Chomsky goes on to propose that intermediate-level categories, which we shall henceforth refer to as "bar-level" or "X'" (= "X-bar") categories, are "invisible at the interface and for computation" (Chomsky 1995:242-243).6 This is in accord with the principle of inclusiveness: if bar-level categories are indeed invisible to the LF interface, then by inclusiveness we expect the syntax to be unable to operate on such categories.

4.6.2. X' Invisibility Revisited

In chapter 1 we noted a simple, but potentially deep, problem with the assumption that bar-level categories are invisible at the LF interface under a representational definition of C-command. The problem is simply that intermediate-level categories are crucial in establishing the asymmetry of C-command between the specifier and complement of a head X°: specifiers C-command
complements, but not vice versa—a crucial asymmetry that pervades every theory involving C-command of which we are aware. But under the standard "first branching node" definition of C-command, defined on phrase-structure representations, X' must at least be "visible enough" for it to be construed as the first branching node dominating the complement of X°, hence preventing the complement from C-commanding the specifier of X°. Thus either bar-level categories are not in fact "invisible" to computation and interpretation or something is wrong with the representational definition of C-command. We opted for the latter analysis. Under our derivational definition of C-command, the invisibility of bar-level categories is irrelevant to the issue of C-command: X C-commands Y if and only if X merges with a phrase-marker of which Y is a term. Since a phrase in Spec X° necessarily merges with a phrase-marker of which the complement of X° is a term, the specifier C-commands the complement; conversely, since the complement of a head does not merge with a phrase-marker of which the specifier is a term, the complement does not C-command the specifier. Hence the desired asymmetry in C-command is retained, even under the assumption that bar-level categories are entirely invisible at the LF interface. Interestingly, our analysis of the C-command asymmetry between complement and specifier does not depend on whether or not bar-level categories are visible at the interface; the asymmetry stands regardless, given the derivational definition of C-command. Let us now examine the issue of X' invisibility in the context of Chomsky's analysis of the structure of VP and Case-feature checking.
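Before turning to that issue, the asymmetry just described can be verified mechanically. The sketch below is our own illustration (the names Node, merge, and CCOM are invented for the purpose): at each application of Merge it simply records that each input C-commands every term of the other, and nothing else.

    class Node:
        def __init__(self, label, children=()):
            self.label = label
            self.children = list(children)
        def terms(self):
            # A category is a term of itself, as are all terms of its parts.
            yield self
            for child in self.children:
                yield from child.terms()

    CCOM = set()    # pairs (X, Y) such that X derivationally C-commands Y

    def merge(x, y, label):
        for t in y.terms():
            CCOM.add((x, t))
        for t in x.terms():
            CCOM.add((y, t))
        return Node(label, (x, y))

    # [XP Spec [X' X0 Compl]]
    x0, compl, spec = Node("X0"), Node("Compl"), Node("Spec")
    xbar = merge(x0, compl, "X'")
    xp = merge(spec, xbar, "XP")

    print((spec, compl) in CCOM)   # True:  the specifier C-commands the complement
    print((compl, spec) in CCOM)   # False: the complement does not C-command the specifier

Whether or not X' is "visible" in the output plays no role in this computation, since the relations are fixed at the point of Merge.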
4.6.3. A Problem for X' Invisibility

Chomsky (1995) proposes a revision of his earlier analysis of Case checking, in which projections of Agr play no role. Noting that Agr's features serve only to check Agreement and Case on the lexical categories V and DP/NP, and that they have no interpretation either at PF or in the semantic component, he argues that it seems wrong to hypothesize the existence of syntactic categories that play no role at either interface level.7
Thus while in Chomsky (1993) Agr is held to be the locus of Agreement-feature checking as well as Case checking when T or V has adjoined to Agr (see Jonas and Bobaljik 1993 and citations therein), Chomsky (1995) revises the analysis and proposes that T alone is the locus of Nominative Case and Agreement feature-checking, while a new category, v, is the locus of Accusative Case and Agreement checking. The category v takes VP as its complement, checks Accusative Case of its object in Spec vP, and hosts the subject of a transitive verb phrase. Thus before object raising has taken place, the structure of vP is as follows:
Now, the bar-level category v', by hypothesis, does not in itself receive an LF interpretation, as its presumably obligatory external theta-role is not expressed. The projection vP, however, includes the subject, and is thus interpretable. At the same time, we expect, under the hypothesis that bar-level categories are not syntactically active, that v' should be syntactically inert. VP and vP, on the other hand, being maximal projections, are not syntactically inert. Presumably, this is also consistent with the interpretability of VP, though the issue is unclear. Thus the correlation between bar-level status and interpretability plausibly holds in this case. Consider, now, overt object raising to a second specifier above the subject for Case checking, as has been hypothesized for some Germanic languages.8 The resulting structure is (21):
Notice that what once was a vP is now a v', an intermediate-level category. Thus by the X' invisibility hypothesis, it too should now be inert for syntactic operations. But we now lose the correlation between semantic interpretability and syntactic visibility: does the new maximal projection vP receive a semantic interpretation? Possibly yes, but note that the projection arises only insofar as the object raises to a specifier position to check features, a matter unrelated to any plausible semantic compositionality that could define an interpretation for it. Does the old vP, now a v', change in its interpretive status? It is difficult to see how, since it remains unaltered as far as LF is concerned; whatever interpretation it would have had without object raising it must still have. In fact, under Chomsky's approach, in a language without overt object raising only the formal features of the object would raise covertly, and they would adjoin to v°, not affecting the bar-level status of vP at all. Hence, the correlation between X' invisibility and semantic invisibility is potentially a problematic one, and this correlation appears difficult to maintain in light of the structure of vP under Chomsky's account. If we cannot motivate the invisibility of bar-
level categories by claiming that they are interpretively null, then it is perhaps a simpler assumption to claim that bar-level categories are not in fact invisible to syntactic operations and claim instead that any term is potentially the input to some syntactic operation.
4.6.4. A Problem for Minimize Sisters

Another problem for the preceding vP structure arises under considerations of this very chapter: it appears that overt object shift in languages such as Icelandic does indeed raise the object to a position higher than the base position of the embedded subject (see Jonas 1995 and Ura 1996). But under the Minimize Sisters hypothesis of derivational local economy, such a derivation would be impossible. If the subject merges with vP before object raising, as shown in (20) and (21), raising of the object over the subject would result in Derivational Sisterhood between the object and the subject. But this is a sisterhood relation that can be avoided through countercyclic merger of the moved object with the v', as can be seen in (22):
Thus the syntax allows only merger with v', since this minimizes the number of derivational sisters created by movement. But this operation is countercyclic, and the derivation crashes (see chapter 5 on the LCA and strict cyclicity). Clearly X' invisibility could come in handy here, as we could hypothesize that remerger of the moving object to v' is not a possible syntactic operation; thus movement to the higher Spec vP would indeed be the operation entailing the minimal number of sisters in checking the object's Case-features. But it would be preferable to stipulate nothing about the visibility of terms in a phrase-marker.
4.6.5. A Possible Solution: Nonprojecting Merger

A possible solution to these problems might be found if we assume that the operation Merge does not necessarily project. Let's look at the standard definition of Merge as it applies to two input terms. In the "syntactic workspace," a new term is defined and the old terms are removed, as in (23). The new term, by definition, contains terms equivalent to the terms that were its input.

(23) a. Before: XP = {X, {X°, Y}}   ZP = {Z, {Z°, Q}}
     b. Merge (XP, ZP) (projecting X)
     c. After:  XP = {X, {X', ZP}} where X' = XP before merger
Both the original XP, now an X', and YP exist as terms after the operation. The null hypothesis is that as terms they are available to syntactic computation. But suppose that Merge does not necessarily project a new category but, rather, simply adds one of its two inputs to the set of terms in the other. For example, in (24), instead of projecting a new term with label X, we add ZP to the old XP:
(24) a. Before: XP = {X, {X°, Y}}   ZP = {Z, {Z°, Q}}
     b. Merge (XP, ZP) (not projecting X)
     c. After:  XP = {X, {ZP, X°, Y}}
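For concreteness, the contrast between (23) and (24) can be mimicked with ordinary set operations; the sketch below is ours, with a pair (label, members) standing in for each term and frozenset standing in for the unordered sets.

    # (23): projecting Merge wraps the old XP, which survives as a term (the X'),
    # inside a new XP. (24): nonprojecting Merge simply adds ZP to the old XP's
    # set of members, so no X' term exists in the output.
    def projecting_merge(xp, zp):
        label, _ = xp
        x_bar = xp                           # the old XP survives, now an X'
        return (label, frozenset({x_bar, zp}))

    def nonprojecting_merge(xp, zp):
        label, members = xp
        return (label, members | {zp})       # ZP is inserted; no X' is created

    XP = ("X", frozenset({"X0", "Y"}))
    ZP = ("Z", frozenset({"Z0", "Q"}))

    print(projecting_merge(XP, ZP))          # corresponds to (23c): {X, {X', ZP}}
    print(nonprojecting_merge(XP, ZP))       # corresponds to (24c): {X, {ZP, X0, Y}}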
Note that while in (23) the input XP is extant in the output, with the new bar-level status of X', in (24) the old XP does not exist as a term at all. This new operation—let us call it "nonprojecting Merge"—is not more "complex" than "projecting" Merge. In both cases, a new term is defined, and the old ones are removed from the workspace as separate phrase-markers. Both cases are also similarly asymmetric: just as projecting Merge projects the head of one of its input terms (X in (23)), nonprojecting Merge "inserts" one of its terms into the projection of the other (ZP gets into XP in (24)). Note that the result of this binary operation is a ternary branching structure. With respect to the C-command relation, we find that the representational and derivational construals of the relation yield different predictions for such a structure. Under a representational definition of C-command, all of the terms that are sisters in the output of nonprojecting Merge should C-command each other. This could be quite problematic, especially in light of a theory of the linear ordering of terminals such as Kayne's (1994) LCA, or Chomsky's (1994) bare theory reformulation of the LCA, that depends on asymmetric C-command for the computation of the temporal order of lexical items, or in fact for any theory in which a specifier must asymmetrically C-command the complement of a head H. But under a derivational approach to C-command, the problem does not arise, for the same reason explicated in chapter 1: in the derivation in (24), ZP is merged with a term containing YP and X°, thus it C-commands them, but YP and X° are never themselves merged with a category containing ZP, and thus neither C-commands ZP. In fact, the C-command relations that obtain in the outputs of (23) and (24) are the same, with one exception: (23c) contains the term X' = {X, {X°, Y}}, which C-commands ZP (since it was merged with ZP). We see now that the matter of the syntactic inertness of at least some bar-level categories might be accounted for: the category that
would be X' under projecting Merge would cease to exist under nonprojecting Merge. It therefore follows that the category is henceforth inert, as it is no longer a term in a phrase-marker. An issue arises with respect to the syntactic relations derivationally defined by the operation Merge on the input term that disappears: by the derivational definition of C-command developed in chapter 1, the "disappearing X'" of (24) does indeed C-command the specifier ZP of the new XP, since at the point of its merger it was indeed a maximal, and hence visible, projection. That the term disappears from the syntactic workspace does guarantee its syntactic inertness at any point thereafter; but at the point of Merge, it does enter into syntactic relations. Whether or not the C-command relations established for the disappearing X' "matter" to the syntax or postsyntactic computation/interpretation we leave as an open question, but we can examine here some of the immediate consequences of each possibility. Let us begin by assuming that the "disappearing X'" does enter into a C-command relation with ZP, C-commanding it and the terms it dominates. In other words, a term in the input but not the output of Merge does enter syntactic relations by virtue of that operation. A problem arises with respect to the LCA, of the same nature as that noted in Kayne (1994) and Chomsky (1995): since X' and ZP C-command each other, the relation is not asymmetric, and there is no ordering of terminals/lexical items dominated by ZP and X', which we have analyzed thus far as inducing a crashing derivation. Kayne resolves this problem by assuming that specifier positions are in fact adjoined positions and that the X' level does not exist; it corresponds to the lower segment of XP, which, by itself, cannot participate in C-command relations. Chomsky (1994, 1995) solves this problem through the hypothesis of X' invisibility: since the term X' is invisible, it does not participate in C-command relations. For us, the problem would remain without some additional stipulation. However, if the disappearing X' does not participate in C-command relations when a category (such as ZP in (24)) "merges into" it via nonprojecting merger—that is, terms that are in the input but not the output of Merge do not enter syntactic relations after the application of that operation—then this problem is solved: the only new C-command relations entailed
by merger of ZP in (24b) are the asymmetric C-command relations between ZP and the terms of XP that are in the input and the output of Merge, namely, everything the disappearing X' dominates. Thus again informally following Kayne (1994) and Chomsky (1994, 1995), the terminals of ZP precede the terminals of the disappearing X'.9 In either case, this account rests on the derivational construal of C-command. Since the relation is not defined on the resulting structure, the form of the resulting structure is irrelevant to the relation, which is instead defined on the structure-building operation. Another interesting consequence of the hypothesis that "disappearing" terms under nonprojecting merger do not enter into C-command relations comes to light when we consider the Minimize Sisters requirement. Let us consider two possible derivations in which a phrase YP is merged with another phrase XP. In the first derivation, projecting Merge is chosen such that XP projects; in the second case, nonprojecting Merge is chosen such that YP "enters" XP.

(25) a. Projecting Merge:
b. Nonprojecting Merge:
There is one potential difference in derivational C-command relations between the two derivations. In the first derivation, (25a), there exists a term X' that C-commands YP and is C-commanded by YP—that is, they are derivational sisters. In the second derivation, (25b), we encounter again the question of whether the disappearing phrase that is the input to Merge participates in C-command relations. If so, then it, too, must be counted as a sister to YP, even though it disappears from the resulting output. But if the disappearing phrase takes its C-command relations along with it, so to speak, then YP is a sister to nothing in the second derivation.10 Thus the second operation is invariably cheaper and must be chosen under the Minimize Sisters approach to local economy.11 If this analysis is correct, the prediction is that there are no intermediate-level categories in phrase-markers. This would then explain their syntactic inertness with respect to movement without stipulating their "invisibility" or trying to deduce their invisibility from interface considerations, which as we have seen is a problematic matter.
4.7. Summary

In this chapter we examined the matter of local economy conditions as they might be applied to the derivational approach to syntactic relations. We argued that since Derivational Sisterhood is the "odd man out" in creating symmetric relations that are computationally complex, it is expected that the establishment of the relation is subject to a local economy constraint, Minimize Sisters. Such a constraint has effects that capture basic relativized minimality effects and, furthermore, ensures binary branching in the general case. Finally we examined the possibility of "nonprojecting Merge" as a possible solution to an apparent problem of X' invisibility. The problem is solved by saying that Merge need not always establish a new projection, but may reconstitute an old one without further projection; the term X' then disappears from the phrase-marker and is thus no longer available for computation.
4.8. Appendix: "Irrelevant" C-Command Relations and an Argument for "Minimize Sisters"

It is perhaps worth speculating somewhat along the lines of Chomsky (class lectures, 1995). He suggests that under minimalist assumptions we expect language design to minimize computational complexity. From this idea we can develop a further conceptual argument to be made on behalf of this view of local economy. Under any definition of C-command, representational or derivational, and in any given phrase-marker, there will exist a number of C-command relations that are "superfluous," in the sense that certain pairs of the form "X C-commands Y" receive no particular interpretation: for example, a matrix subject might well C-command (representationally or derivationally) a preposition embedded in a CP complement clause in a DP adjunct embedded three clauses down in the tree. Such a relation, though it is defined, would not appear to be syntactically significant, insofar as the relation is used neither for PF nor LF interpretation, nor for syntactic feature checking.12 However, as is easily seen, cyclic structure building automatically entails the establishment of C-command relations between categories between which one would not expect any relation to be required; it is "a fact of life" under our construal of C-command (and in fact under a representational definition of C-command as well). The question then arises: "Have we in fact isolated the correct relation between categories?" If C-command is well-defined between categories in a tree which we have no reason to believe are in any significant syntactic relation with each other, have we correctly understood the relation? We believe that we have correctly understood the relation and that the superfluity of relations established is in fact a correct characterization of the syntactic mechanism that creates them (concatenation—that is, Merge and Move). The argument here rests on the assumption that Shortest Move and binary branching are empirically well-motivated properties of the syntactic component. By assuming that the automatic creation of potentially superfluous relations is also a real property of the syntax, we may gain insight into why these two (now unified) properties hold.
Note first of all that given the requirement of cyclic structure building, superfluous C-command relations will ensue, as in the example mentioned previously. But cyclic structure-building leaves no room for avoiding these superfluous relations; for Merge/Move to be cyclic, the target must be the head of the root node of any given phrase-marker, resulting in potentially superfluous C-command relations. But for an operation of feature checking via movement, the relation in question is not C-command itself, but mutual C-command—that is, Derivational Sisterhood. Hence choices may arise in the course of a derivation: first, the question of which category should move to check a feature F of the target head and, second, the question of where a category with a given feature F should move to check that feature. For feature-checking, the required relation is Derivational Sisterhood. The effect of the Minimize Sisters constraint is to minimize the number of syntactically insignificant sisterhood relations that obtain as a result of some movement; in other words, it is precisely the point at which superfluous relations may be avoided, since by moving the "closest" category "as close as possible" to a target head, we minimize the number of non-feature-checking sisters created. Since any pair of categories that are derivational sisters may potentially check each other's features, each pair of sisters must be "examined" by the syntax for the possibility of feature-checking, even if no features in fact hold. "Minimize Sisters" is thus a means by which syntactic computation avoids creating potentially significant syntactic relations (sisterhood) between categories of which no local relation need hold. In other words, the insignificant relations are real to the syntax, which therefore avoids them via Minimize Sisters. The picture emerges whereby the simplicity of concatenative structure-building yields more relations than are needed by the syntax, and this superfluity of relations is curtailed where possible (i.e., where there is a choice) by "Minimize Sisters." Economy of Derivation can be seen as a way of reducing computational complexity in a system in which the simplest possible operation that suffices for structure-building (Merge/Move—that is, Concatenate) provides an overabundance of potentially significant local intercategorial relations.
It is difficult to see how Shortest Move effects follow from any properties of a representational system providing structural definitions of syntactically significant relations (such as the Spec-Head configuration). Under a theory incorporating the notion of Checking Domain, a phrase that has moved to check features by moving to the specifier of some head H is in a significant local relation only with the head H which was the target of movement; the fact that the phrase does not move into the specifier of some higher head G is explained by stipulation (no relation to the target head would be formed), and the fact that it is the "closest" of several competing categories that moves to that specifier is again accommodated by pure stipulation. In contrast, under our approach, both phenomena make sense to the extent that computational complexity inherent to the derivational system is thereby avoided. Hence, it is no surprise that a local economy constraint like "Minimize Sisters" is built into the grammar: to avoid computational complexity.
Notes

1. See chapter 4.

2. Greed is essentially a subcondition of a more general constraint that syntactic operations are allowed only if they receive an interpretation of some kind by postsyntactic PF computation/semantic interpretation. If an unchecked feature is not a legitimate object with respect to either PF computation (in the case of strong features) or semantic interpretation (all uninterpretable syntactic features, such as Case), then a Merge/Remerge operation is licit precisely to the extent that the otherwise illegitimate feature is deleted or erased. See chapter 5 for discussion of this issue. It thus may be the case that (42) as formulated is incorrect: Fox (1995a, 1995b), in an analysis of VP ellipsis, has argued that Quantifier Raising is a licit syntactic operation precisely to the extent that it entails a distinct semantic interpretation, and not because feature-checking is involved; that is, if QR does not entail a distinct interpretation, then it is disallowed. If this is correct, then the establishment of a Checking Relation is not a necessary condition on remerger, but it is a sufficient one. As long as remerger is motivated by interpretive requirements outside the syntax, it is a licit syntactic operation. In fact, if Move is to be taken as nothing more than Remerge, then we do not expect any condition on the operation that does not apply to Merge. See the discussion in chapter 6, section 3.
3. Though a thorough investigation of this issue goes beyond the scope of this book, some interesting cases to consider might be cases of optional movement—in particular, the cases of quotative inversion and Locative inversion as analyzed by Collins (1997).

4. We shall see in the next section that a hitherto unexplored extension of the formalization of Merge, in which terms containing more than two terms are created, becomes a possibility; thus we should not a priori rule out extensions of the formalization, but rather give arguments that they are or are not motivated extensions.

5. There exists the possibility that some requirement might entail the merger of more than two categories for its satisfaction; multiple feature-checking (see esp. Ura 1996) could well be an example of this. See also Collins (1995).

6. See also Chomsky (1995:fn. 24) and references cited therein; see also Fukui (1995) and Brody (1996).

7. Note, again, an apparent inconsistency here: individual features that play no role outside the syntax, such as the uninterpretable Case-feature of T or DP, are hypothesized to exist—and they do play a role in syntactic derivations. Agr differs from these categories only in having no interpretable features at all. Chomsky's claim, then, might be better recast as a minimal assumption not about the syntax but about the lexicon: it is not the case that lexical items bearing uninterpretable features are not admissible to UG, but that lexical items bearing nothing but uninterpretable features (such as Agr) are disallowed. Why this should be true remains mysterious.

8. Jonas (1995) argues that in Icelandic and Faeroese the object crosses over the subject for Case checking. See also Ura (1996) for a discussion of whether object shift precedes subject merger or vice versa. Ura argues that both derivations are possible and subject to language-specific variation based on whether strong feature checking or violations of Procrastinate are involved (the former induces object raising before subject merger; the latter, subject merger before object raising). See also Chomsky (1995).

9. A closer look at the LCA follows in chapter 5; see also chapter 2, section 6.

10. If YP is undergoing "Remerge," perhaps moving to check features via Derivational Sisterhood with X°, then it does of course become a derivational sister to (at least) X°. The important point here is that it does not become a derivational sister of the disappearing XP.

11. Interestingly, the opposite would be the case if binary branching were a representational constraint (perhaps derived from a principle of representational economy such as "minimize branching"), since (25a) is binary branching, while (25b) is ternary branching.

12. The relation may be significant with respect to the LCA, following Kayne (1994) and chapter 5 of this book, in that all asymmetric C-command relations come into play when calculating the linear order of terminals in the structure (i.e., lexical items). However, Kayne's LCA imposes a strong restriction on the terminal-ordering reflex of asymmetrical C-command relations in a structure: a linear order of terminals must follow from the asymmetric C-command relations (under the mathematical definition of a linear order). This strong restriction serves not only to provide a link between structure and word order, but to restrict phrase-structure as well; Kayne derives several important properties of X' theory from this
strong restriction. However, Chomsky points out that the desired properties of phrase-structure Kayne derives may be derived from independently motivated notions of simplicity and conceptual necessity under the bare theory approach (Chomsky 1994, 1995). Conceivably then, the notion of linear order might be too strong a restriction; it is not in fact needed to constrain phrase-structure, nor is it in fact needed to fully determine the order of terminals. Given three terminals a, b, and c, only two ordering relations are required to order them: "precedes (a, b)" and "precedes (b, c)," though a linear order would require "precedes (a, c)" as well. Thus the core idea that PF orders lexical items by virtue of syntactic information (be it encoded in a tree directly, or indirectly through C-command) provides us with very few guidelines as to what the correspondence might consist of, a priori. In particular, as is relevant here, it is not clear that long-distance C-command relations like the one described in the text are necessarily syntactically significant, even to the extent that an algorithm like the LCA is responsible for calculating precedence information.
5 The LCA, Cyclicity, Trace Theory, and the Head Parameter
Thus far we have examined the consequences of a strictly derivational account of syntactic relations in two domains: first with respect to semantic interpretation in chapter 2 and second within the syntactic component in chapters 3 and 4, in which we developed a theory of local relations required for feature-checking and a version of local economy that constrains syntactic operations. In this chapter we will take a closer look at one aspect of PF computation in the derivational framework, namely, linear ordering of terminals, along the lines of Kayne's (1994) LCA. We will see, as first suggested in chapter 2, that the derivational definition of C-command entails C-command relations different from those that follow from the representational definition, in the case of countercyclic Merge/Move. Countercyclic operations turn out to result in a set of derivational C-command relations that do not provide a full ordering of the lexical items in the phrase-marker (i.e., the C-command relations "underdetermine" the order). Therefore, countercyclic operations cannot be overt; we thus derive the cyclicity of overt operations without stipulating an Extension Condition (Chomsky 1993).1 We will also examine movement of categories, developing more fully the notion of "Remerge" suggested in chapters 3 and 4, which amounts to movement. We will see that countercyclicity is a property of pairs of operations (i.e., if one operation is "cyclic," then the other is not, and vice versa).
The LCA, as formulated, turns out to be problematic for movement as Remerge. We will propose a new formulation of the LCA, in which C-command proper, rather than asymmetric C-command between pairs of categories, induces precedence relations in the phonological component. Movement as Remerge then entails an "overdetermination" of the precedence relations among terminals. We propose an algorithm by which the PF component trims away the overdetermining relations, resulting consequently in "displacement" of the Remerged (=moved) category. A further consequence of our reformulation of the LCA is that the merger of a Head H with a complement Y yields exactly the same sort of overdetermination of precedence relations. The two possible solutions to this overdetermination provide room for the head parameter. Unlike Kayne's (1994) and Chomsky's (1995) approaches to precedence, the formulation of the LCA which we propose makes possible head-final as well as head-initial precedence relations in the traditional sense. Finally, we examine more closely what it means under this derivational framework for an operation to be "overt" or "covert." Given that it is the derivational operation that encodes syntactic relations such as C-command, the PF component must have access to these derivational operations to compute precedence relations. Thus it makes no sense in this framework to say that a phrase-marker is "Spelled-Out" to PF computation; instead, there will be operations that receive a PF interpretation, and operations that do not. Those that do are "overt," while those which do not are "covert."
5.1. Kayne (1994) and Chomsky (1994, 1995)

Kayne (1994) proposes an intimate relationship between asymmetric C-command and precedence relations, expressed by the LCA. Chomsky (1994, 1995) modifies Kayne's original proposal, but for present purposes we adopt the relevant aspects of Chomsky's approach, mentioning any points we must revise.
Under the bare-theoretic conception of phrase-structure, no ordering is established between merged or moved categories; a term that is not a head is a set containing two members: a label and an (unordered) set consisting of the terms merged. Sets encode no ordering among their members; thus the terms merged or moved are not syntactically ordered. But somewhere in the phonological component (or, in certain frameworks, at a point of "Spell-Out"), a temporal order must be established so that the phonological component can do its job. Under the derivational approach, there is no point of Spell-Out in the derivation, as we shall see, but for now let us retain the idea of Spell-Out.
5.2. Underdetermination of Phono-Temporal Order: The SCC and Overt Merge/Move

What follows is an extension of the idea advanced in chapter 2, showing how the derivational construal of C-command interacts with the LCA to force overt, but not covert, Merge/Move to be cyclic. Consider the following tree:
Informally, the LCA (in the bare theory framework) states that each terminal contained in XP precedes each terminal that XP asymmetrically C-commands.
Under the derivational approach, the C-command relations that XP is in are encoded by the operation Merge (XP, YP) in the derivation.2 Thus the phonological component, or the rule "Spell-Out," must have access to the derivation, so that it can "read off" the C-command relations that the derivation entails. With this in mind, consider the case of overt countercyclic merger of WP into the tree in (1). The full derivation is shown in (2):

(2) a. Merge (Y°, ZP)
    b. Merge (XP, YP) (YP becomes Y' under the relational definition of projection level)
    c. Merge (WP, Y') (Y' becomes Y'1)
As a property of merging XP and YP (where YP becomes Y') in (b), XP asymmetrically C-commands Y° and ZP; thus the terminals of XP will precede Y° and the terminals of ZP, by the LCA. As a property of merging WP in (c), WP will C-command everything XP does; thus the terminals of WP will also precede Y° and the terminals of ZP, by the LCA.3 But note that XP derivationally C-commands neither Y'2 nor WP. This is because XP was never merged with a category containing either WP or Y'2. Similarly, WP does not C-command XP, since it was not merged with a term containing XP. Thus, since no C-command relation holds between XP and WP, no precedence relation is established between the terminals of XP and the terminals of WP. Although both sets of terminals precede Y° and the terminals of ZP, the relative order of the terminals of XP and of WP is not established. This results in a violation of Full Interpretation at PF. The derivation therefore crashes. Note that this result would not obtain under a representational definition of C-command. Representationally, XP does C-com-
mand WP, and precedence relations between the terminals of XP and WP would be established. Furthermore, note that the countercyclic operation is ruled out only if the operation must receive an interpretation at PF. A covert operation, which receives no PF interpretation, is not constrained in this manner. We see that overt countercyclic structure-building causes a crashing derivation. Covert structure-building, on the other hand, is not subject to Full Interpretation at PF; thus countercyclic Merge/Move should be possible. Putting the matter of adjunction aside, this is exactly the stipulation made in Chomsky (1993, 1994) as an exception to the Extension requirement:

(3) a. The Extension Requirement (Chomsky 1994:22)
       . . . GT and Move-α extend K to K*, which includes K as a proper part.
    b. Exception to the Extension Requirement (Chomsky 1994:24)
       . . . the Extension Requirement holds only for substitution in overt syntax. . .
The lack of cyclicity for covert movement will of course be central to any analysis of covert movement not targeting the root node of the phrase-marker. That this stipulated (noninclusive) exception to cyclicity in (3b) can be derived from the derivational construal of C-command lends strong support to the derivational approach.
5.3. The "Complexity" of Countercyclic Merge and Covert Feature Raising Chomsky (class lectures, 1995) suggests another way to rule out countercyclic operations, in fact across-the-board, whether overt or covert. First of all, in the approach taken in Chomsky (1995), covert movement is always simply feature raising to a head that attracts the feature. If this operation is construed as a sort of "embedding" of the raised feature into the head, thereby simply changing the internal
feature-constitution of the X° category, then it can be construed as not countercyclic.4 As far as overt operations are concerned, Chomsky (1995:254) claims that countercyclic Merge/Move targeting some term T internal to a phrase-marker K is a more complex operation than cyclic Merge/Move targeting the root of a phrase-marker, since not only must a new term T* be defined as the output of Merge/Move (as with Merge/Move targeting the root), but the new term T* must be inserted into the position in K formerly occupied by the target T. Thus countercyclic Merge/Move not only creates a new term T* but must also redefine K to include T* as a term. If we can make do with only the less complex operation, the cyclic one, then the minimalist approach would have UG admit only this "simpler" operation (unless for empirical reasons we are forced to deviate from this minimal assumption). Thus a simpler model of UG arises if there is no overt (or covert) countercyclic Merge/Move operation; the only apparently countercyclic operation is LF feature-checking. However, as discussed in chapter 2 (section 2.2), the theory of feature raising opens up a number of problems that lead us to reject it (in particular, the problem of the significance of the specifier position in overt movement). We returned to a theory of category movement, as it allowed us to develop a theory of locality based solely on C-command: Derivational Sisterhood. Nonetheless, Chomsky's conceptual argument still holds. We are thus left with a problem: if countercyclic Merge/Move is invariably a more complex operation than cyclic Merge/Move, and hence not to be admitted as a fundamental operation in CHL, how can we accommodate covert countercyclic category raising?
5.4. Retaining "Simple" Merge and Covert Category Movement

Note that countercyclic Merge/Move is only more complex than cyclic Merge/Move if the term T* that is the output of the operation
is "reinserted" into the phrase-marker K as a part of the operation. The argument that countercyclic Merge/Move is more complex is based on the fact that an extra step of "re-insertion" is required internal to the operation Merge/Move if the operation is countercyclic, while this step is not required for cyclic Merge/ Move. But it turns out that there is no need for such an assumption under the derivational approach. Let us simply say that Merge/ Move creates as its output a new term and does not insert it into any K, any more than cyclic Merge/Move does. For example, a countercyclic operation targeting X in (4a) yields a new phrasemarker B in (4b):
If this operation is overt, we have the same C-command problem as in the previous example: A asymmetrically C-commands Q and R, but there is no derivational C-command relation between A and Z. Thus no ordering of the terminals of A and Z is possible. Furthermore, the existence of the new term B does not resolve the matter: it C-commands nothing, since it has not merged with anything, thus no additional C-command relations are available that facilitate a fully determined phono-temporal ordering of terminals, as the LCA requires. The situation with respect to Dominance here parallels that of C-command in (2c), in which ZP is C-commanded by XP and WP, neither of which C-commands the other. The same holds in (4): Q, for example, is C-commanded by A and Z, neither of which C-commands the other. Note that neither merger of Z and X forming Y nor merger of X and A forming B is derivationally ordered with respect to the other. For Y to be formed, only X and Z need exist; for B to be formed, only X and A need exist. Both operations "must follow" only the construction of X, but neither need follow the other.5 In a sense, then, neither operation is the countercyclic operation: countercyclicity simply entails two operations that are each unordered with respect to the other, but which both must follow some third operation (in (4b), the relatively countercyclic operations must follow the construction of X, on the existence of which they obviously depend). Thus merger of X and A is countercyclic with respect to merger of Z and X, and vice versa. The core idea here is that two derivational operations are syntactically ordered with respect to each other if and only if one of the operations is syntactically dependent on the other. An operation O2 will be dependent on operation O1 if and only if O2 operates on a term that requires O1 for its very existence: that is, O1 created a term necessary in the Structural Description of O2. In the preceding example, merger of Z and X (call this O2) is dependent on the merger of Q and R (call this O1), since X comes into being only by virtue of O1; without O1, O2 could not take place. In contrast, merger of Z and X (again, call this O2) does not depend on the merger of X and A (call this O1), since the merger of X and A is not necessitated for the merger of Z and X (O2). (In chapter 6 we explore this relation between operations more fully.)
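This dependency relation is easy to state mechanically; the sketch below is ours (labels are omitted, and a phrase-marker is represented simply as the set of things merged), and it reproduces the verdicts just given for (4).

    # O2 depends on O1 iff O1's output is, or is contained in, one of O2's inputs.
    def is_term_of(t, k):
        return t == k or (isinstance(k, frozenset) and any(is_term_of(t, m) for m in k))

    def depends_on(o2_inputs, o1_output):
        return any(is_term_of(o1_output, inp) for inp in o2_inputs)

    X = frozenset({"Q", "R"})        # output of Merge(Q, R)
    B = frozenset({X, "A"})          # output of Merge(X, A)

    print(depends_on({"Z", X}, X))   # True:  Merge(Z, X) depends on Merge(Q, R)
    print(depends_on({X, "A"}, X))   # True:  Merge(X, A) depends on Merge(Q, R)
    print(depends_on({"Z", X}, B))   # False: Merge(Z, X) does not depend on Merge(X, A)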
As we saw, a derivation such as (4) is excluded if the two "relatively countercyclic" Merge operations must both receive a phonological interpretation, since merged lexical items are left unordered. But the LCA does not rule out such a derivation if one or the other of the "relatively countercyclic" operations is covert, receiving no phonological interpretation. Does admitting structures such as that shown in (4) pose any problems for covert operations? After all, we end up with a nontree; Q, R, and X are each dominated by both Y and B, neither of which dominates the other.6 But if it is not the tree that serves as the set of instructions to the semantic component but, rather, the derivational steps (as argued in chapter 2), then the question is simply whether the derivation of such a structure receives a (legitimate) interpretation by the semantic component. There is in fact at least one way that the derivation could receive a legitimate interpretation: if one of the two terms created countercyclically, either Y or B in (4b), receives no interpretation at all. One case of this would be if the semantic interpretation of the Merge operation does not involve composition of the semantic features of the inputs to Merge. In this case, the new term that is the output of Merge does not correspond to any semantic interpretation; thus it is irrelevant to the semantic component that it is in neither a Dominance nor a C-command relation with other terms. When would such a case arise? Movement—remerger for the checking of formal features—is one such case. An object DP, for example, moving covertly and countercyclically to Spec vP to check Case-features, is moving purely for the sake of deleting an uninterpretable Case-feature; the move is licensed by Greed not because of any compositional semantic need of either v or DP, but because an uninterpretable feature must be eliminated (see also chapter 4, section 6.3). Thus, although the operation Remerge entails the creation of a new category, this category by hypothesis does not correspond to any composed semantic object and receives no semantic interpretation that must be composed "into" the rest of the phrase-marker.7 In light of this, it would appear that we can make do exclusively with the "simple" conception of Merge, in which no categories are ever "re-inserted" into a phrase-marker. Nonetheless, we may allow "simple" Merge to apply countercyclically.8
Countercyclic Merge then will be possible under two conditions: (1) that it not be overt, and (2) that it not be licensed for compositional semantic reasons.9 In short, we derive the generalization that countercyclic movement will be allowed for covert feature-checking, while at the same time preserving the minimal assumption that only "simple" Merge is a rule of UG. Thus we get essentially the same results as those obtained by Chomsky (1995), without hypothesizing either covert feature raising or "complex" Merge/Move.
5.5. Movement as Remerger: A Problem for the LCA

The idea that Move is nothing but the merger of a category already merged, licensed by the need to check features of some head, is itself a straightforward one, and we have been adopting it in lieu of trace theory thus far without comment and without apparent problems. But under such an approach, there is no notion of trace theory, nor in particular a copy theory of traces: no "copies" are created by remerger, which simply adds a new set of Dominance and C-command relations to the derivation. A problem then arises with respect to movement. Let us first restrict our discussion to the movement of complex categories. Given the framework we developed in chapter 2 concerning the nature of local relations, a category XP checks features against a head H° only if XP and H° derivationally C-command each other (Derivational Sisterhood). But this means that, as a result of the checking operation, H° does not asymmetrically C-command XP (though it asymmetrically C-commands the terms dominated by XP).
(5) a. [H° ... [... XP ...]]
       before Remerge: H° asymmetrically C-commands XP and all terms dominated by XP.
       LCA: H° precedes terminals of XP.
    b. [XP [H° ... [... XP ...]]]
       after Remerge: H° and XP derivationally C-command each other; H° asymmetrically C-commands terms dominated by XP.
       LCA: H° precedes terminals of XP.
In short, the LCA requires antisymmetry, while checking under Derivational Sisterhood requires symmetry. If only antisymmetric C-command relations induce a precedence relation among terminals, following Kayne (1994) and Chomsky (1994), then only the asymmetric C-command relation between H° and the lexical items dominated by XP will be phono-temporally ordered, and H° will precede all of the terminals of XP even after movement of XP. Thus syntactic movement of a complex category does not yield any change in the phono-temporal order of terminals! Now consider the case where the moved category is both minimal and maximal—for example, a pronoun undergoing object shift to Spec vP for Case checking. Again, checking will obtain under Derivational Sisterhood between v° and the pronoun. But this means that v° and the pronoun C-command each other; thus no asymmetric C-command. Furthermore, v° does not asymmetrically C-command any lexical items within the pronoun, since there aren't any. Thus the pronoun and v° are not in any precedence relation. The derivation should crash for the same reason that overt countercyclic operations cause a crash. Thus something has gone awry: we desire that remerger for feature-checking, in the case of strong features, does change the precedence relation between the lexical items in the moving category and other lexical items in the phrase-marker and that movement of pronouns is possible.

5.6. A Solution: Distinguishing Over- from Underdetermination of Precedence

One solution to this problem can be found by reconsidering the relation required for establishing precedence.10 For both Kayne and Chomsky, asymmetric C-command is the relation that is
if category X asymmetrically C-commands category Y, then the terminals of X precede the terminals of Y. In the theory of local relations developed in chapter 3, however, asymmetric C-command has no special status (while symmetric C-command does: it constitutes a local relation). Let us continue to assume that asymmetric C-command has no special status, in which case we will not want to use it in the computation of precedence relations. Instead, let us take C-command, pure and simple, to be the relation that induces a precedence relation among terminals. In doing so, two different scenarios emerge which we can use to distinguish the LCA problem induced by countercyclic operations from that induced by Remerge. Let us take (6) to be our revised LCA:
(6) Linear Correspondence Axiom (Revised)
If X C-commands Y, then the terminals in X precede the terminals in Y.
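To make the effect of (6) concrete, the following is a minimal computational sketch (in Python; the encoding, category names, and helper functions are ours, purely illustrative, and form no part of the theory). It reads a set of C-command relations as ordering instructions and reports any terminal pairs left unordered, which is precisely the underdetermination induced by countercyclic Merge discussed directly below.

```python
from itertools import product

def precedence_pairs(ccommand, terminals_of):
    """Read C-command relations as ordering instructions, per the revised LCA
    in (6): if X C-commands Y, every terminal of X precedes every terminal of Y."""
    pairs = set()
    for x, y in ccommand:
        for tx, ty in product(terminals_of[x], terminals_of[y]):
            if tx != ty:
                pairs.add((tx, ty))
    return pairs

def unordered(terminals, pairs):
    """Terminal pairs for which no ordering instruction exists at all."""
    return [(a, b) for i, a in enumerate(terminals) for b in terminals[i + 1:]
            if (a, b) not in pairs and (b, a) not in pairs]

# Toy case: the relations order a before everything and b before c and d,
# but nothing ever orders c with respect to d (the countercyclic situation).
terminals_of = {"a": ["a"], "b": ["b"], "B": ["b", "c", "d"], "C": ["c", "d"]}
ccommand = {("a", "B"), ("b", "C")}
pairs = precedence_pairs(ccommand, terminals_of)
print(unordered(["a", "b", "c", "d"], pairs))   # [('c', 'd')]: underdetermined, so PF crashes
```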
The cases of countercyclic Merge discussed in section 5.2 are still ruled out in the same manner as before, since no C-command relation between the countercyclically merged category and "higher" categories obtains through countercyclic Merge; thus some of the terminals are unordered with respect to others. The C-command relations that do obtain underdetermine the precedence relation of the terminals. But the situation is different with respect to the case of Move/Remerge. A problem still remains, but it is now of a different sort. In this case, the C-command relations overdetermine the precedence relation among terminals in a manner that yields contradictory, as opposed to nonexistent, precedence relations. For example, consider (7), in which a category XP has moved/remerged "across" a head H° to check features (under Derivational Sisterhood).
(7) [HP XP H° [K ... XP]]
H° C-commands XP by virtue of merging with a phrase-marker K of which XP is a term. Thus H° precedes the terminals in XP. But after remerger of XP, XP C-commands H°, thus the terminals of XP
should precede H°: a contradiction. Worse yet, XP also C-commands itself and all of its own terminals by virtue of remerging; thus the terminals of XP should precede themselves: again a contradiction. Crucially, the shift from asymmetric C-command to C-command as the relation relevant to precedence has two possible consequences crucial here: (1) if C-command does not obtain, then lexical items will not be ordered at all; (2) if symmetric C-command obtains, then the ordering of some terminals leads to contradictions in phono-temporal order. Now, if C-command serves as the sole instruction to the phonological component to order terminals, then it is natural that the absence of such instructions should result in a crashing derivation, as in the case of countercyclic operations, since insufficient ordering instructions are provided by such derivations. But a set of contradictory instructions differs in that instructions are provided. In fact, it is precisely the new set of C-command relations, created via remerger, that causes the problem and induces the contradiction (i.e., before Remerge applied, there was no problem). We can imagine, then, that PF has some means of dealing with "new" instructions that contradict "old" ones concerning precedence. Intuitively we would like PF to ignore some subset of the C-command relations it receives—in particular, to ignore the smallest subset that would resolve the contradiction. In other words, a sort of repair strategy for dealing with "too many" instructions. Bearing in mind the fact that it is the symmetric C-command relations that appear to be causing the problem, we propose the following Precedence Resolution Principle:
(8) The Precedence Resolution Principle (PRP) If two (not necessarily distinct) categories symmetrically C-command each other by virtue of some syntactic operation O, ignore all C-command relations of one of the categories to the terms of the other with respect to establishing precedence via the LCA.
Let us consider again a case like (7), in which XP is remerged:
(9) [HP XP H° [K ... XP]]   Remerger of XP
Note that H° and XP (two distinct categories) derivationally C-command each other. Thus by the PRP, PF computation "weeds out" some of the ensuing command relations. One possibility is for PF to ignore all C-command relations from H° to the terms of XP. The only C-command relation left between H° and XP and its terms is C-command of H° by XP. By the revised LCA, the terminals of XP thus precede the terminals of H°, without contradiction. Note also that XP symmetrically C-commands itself, XP, upon remerger (the case of nondistinct symmetric C-command admitted in the PRP). Thus the C-command relations between XP and the terms of XP are ignored, freeing PF computation from the phono-temporal contradiction of terminals "preceding themselves." But now note that a second possibility exists: all C-command relations from XP to H° could be ignored. If this option is chosen, then the C-command relations visible to the revised LCA will include "H° C-commands XP (and its terms)"—but not "XP C-commands H°." Thus H° should precede the terminals of XP. In this case, no movement appears to PF to have taken place; that is, PF would interpret the mover in situ. We must rule out this possibility, as the whole point of overt movement is, well, overt movement. One possibility is to say that a strong feature is effectively an instruction to the PF component to rearrange its precedence relations (perhaps to obtain adjacency between the head bearing the strong feature and the checking category11). A Remerge operation that is interpreted by the PF component but results in no change in the output of PF is thus ruled out.12 Nunes (1994, 1995) suggests that under a copy theory of traces, the two copies bear different features: loosely, the lower copy still bears unchecked features and is thus eliminated at PF by a principle of Economy of Representation. Such an analysis is not consistent with our proposal that no copies
are involved in movement, since for us movement is remerger of the same category, not a copy of that category, to another position. The problem of why the higher copy, and not the lower copy, of a moved category is pronounced can be seen as a particular instantiation of one of the deepest problems confronting syntactic theory, namely, why does human language involve displacement of categories from positions in which they are interpreted? Our approach certainly does not solve the problem, but, like Nunes' proposal, situates the problem in a way that might prove fruitful for better understanding. To summarize, a Merge/Move operation that provides too few instructions to PF to phono-temporally order the lexical items causes the derivation to crash, as in the case of countercyclic movement. A Merge/Move operation that provides contradictory instructions, on the other hand, does at least provide information in the form of C-command relations that can be interpreted, but it provides too many. The PRP then trims down the C-command relations to the point where they do not constitute contradictory instructions. In this way we solve the problem of Move qua Remerge with respect to the revised LCA.
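As a purely illustrative aid, here is a minimal Python sketch of how the PRP might be applied to the remerge configuration in (9); the data structures and names are ours (hypothetical) and are not part of the proposal itself.

```python
# Hypothetical encoding of (9): XP, with terminals x1 and x2, has remerged across H.
terms_of = {"XP": ["x1", "x2"], "H": ["H"]}
ccommand = {
    ("H", "XP"), ("H", "x1"), ("H", "x2"),     # from the original merger of H
    ("XP", "H"),                               # from remerger of XP
    ("XP", "XP"), ("XP", "x1"), ("XP", "x2"),  # XP's relations to itself and its own terms
}

def resolve(relations, ignorer, other):
    """PRP (sketch): ignore all C-command relations of `ignorer` to `other`
    and to the terms of `other`, removing one half of a symmetric pair."""
    targets = set(terms_of.get(other, [])) | {other}
    return {(a, b) for (a, b) in relations if not (a == ignorer and b in targets)}

# Option 1: ignore H's relations to XP and its terms, so "XP C-commands H" survives
# and overt movement is visible to PF.
resolved = resolve(ccommand, "H", "XP")
# The mover also symmetrically C-commands itself; ignore those relations as well,
# so that no terminal is required to precede itself.
resolved = resolve(resolved, "XP", "XP")
print(resolved)   # {('XP', 'H')}: the terminals of XP precede H, without contradiction

# Option 2 (to be ruled out for strong features): ignore XP's relations to H instead;
# then "H precedes the terminals of XP" and the mover is spelled out in situ.
```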
5.7. Computational Complexity and Minimize Sisters At this point it is worth noting that, in general, a moving category will become a derivational sister not just with the head against which it checks features, but with other intervening categories as well. For each of these categories, the PRP will have to apply; the more intervening categories there are, the more applications of the PRP will be required to establish noncontradictory precedence relations—that is, more computation is necessary. As in chapter 4, it is perhaps again worth speculating somewhat along the lines of Chomsky (class lectures, 1995). He suggests that under minimalist assumptions, we expect language design to minimize computational complexity. Perhaps, then, the economy principle of "Minimize Sisters" can be seen as an aspect of UG that minimizes the computational complexity of Merge/
Move operations in the PF component of the grammar, since it guarantees that the smallest possible number of Derivational Sisterhood relations obtain in the satisfaction of some locality requirement such as checking and thus guarantees minimal "revision" in the PF component of precedence relations already established. Chapter 4 advanced a conceptual argument addressing the question of why Minimize Sisters is a particularly natural economy constraint in the derivational framework. We perhaps shed some light on why the constraint should exist: Derivational Sisterhood entails the possibility of a Checking Relation, and hence of computational complexity; Minimize Sisters then reduces computational complexity, in accordance with the idea that CHL is an optimal realization of interface requirements. Now that we have developed a theory of movement under which Minimize Sisters entails minimal PF computational complexity, this can be taken to be an additional sign that the design of language does indeed tend to minimize computational complexity. 13
5.8. The Head Parameter
The PRP proposed in (8) is applicable whenever two categories C-command each other, resulting in "overdetermination" of phono-temporal order by the LCA. This is the situation that inevitably arises under a Derivational Sisterhood theory of Checking Relations. But there is another case where symmetric C-command arises, as we saw in chapter 3: merger of a head H° and a complement XP. By virtue of merging with each other, H° and XP are immediately derivational sisters: H° and XP C-command each other, and the PRP applies. As in the case of movement, it can apply in one of two ways: either 1) the C-command relations of the terms of XP by H° are ignored for computation of precedence, or 2) the C-command relations of the terms of H° (i.e., just H°) by XP are ignored. These two possibilities correspond exactly to the two possible settings of the head parameter: either H° precedes the lexical items in XP or the lexical items in XP precede H°. Note also that since
no movement is involved, there is no question of "changing" precedence relations, but rather of establishing them in the first place; whatever principle entails displacement (that is, a change in precedence relations, in the case of movement to check a strong feature) will not apply here. Some other principle is at work. This is presumably the head parameter. Upon merger of a projecting head H with a nonprojecting complement, forming a phrase HP, a language in which C-command relations by the projecting head are ignored by PF will be head-final, while a language in which C-command relations by the nonprojecting complement are ignored will be head-initial. The setting of this parameter might be across-the-board for all heads (as is likely to be the case in strongly head-final languages like Japanese, Korean and Turkish) or might be a head-specific parameter (as is suggested by the word order of many Germanic languages, in which CP and TP are apparently head-initial, while VP appears head-final). Note that this account significantly differs from the LCA of both Kayne (1994) and Chomsky (1995) in its predictions concerning word order. Following these approaches, no traditionally head-final structures are possible at all: for a head and its complement to be LCA-consistent, asymmetric C-command must hold between the head and the minimal projection(s) of its complement. But asymmetric C-command will entail precedence; thus all languages are underlyingly head-initial. This rather strong claim is difficult to reconcile with the facts of apparently head-final languages (see Kural 1997, Koizumi 1995). Groat (1994) argues that the core facts of Turkish verbal morphology can be accounted for under a head-final approach in which functional categories are spelled out as clitics; the one case of auxiliary verb raising that arises appears to be an exact parallel to cases of German auxiliary raising in subordinate clauses, a parallel that is implausible under the assumption that Turkish is underlyingly head-initial. Kural (1997) directly critiques an LCA-consistent analysis of Turkish phrase-structure on the basis of scope relations between postverbal constituents.
movement construed as Remerge. We saw that Minimize Sisters, an economy metric that minimizes computational complexity in the PF component in the case of movement, has the side effect of in general enforcing the binarity of Merge even when no movement is involved. Likewise, the PRP, a "repair strategy" for dealing with the results of movement operations, has the side effect of leaving subject to parametric variation the order between head and complement, even when no movement is involved. In both cases, apparently unrelated phenomena are linked, both deriving from the same underlying principles: Minimize Sisters and the PRP.
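To illustrate (and only to illustrate; the function and its parameter names are our own hypothetical shorthand), the head parameter can be pictured as nothing more than the choice of which side of the head-complement symmetry the PRP silences:

```python
def order_head_complement(head, comp_terminals, ignore_side):
    """Head parameter as a PRP choice (sketch). Upon Merge(head, complement) the
    two symmetrically C-command each other; PF ignores the C-command relations of
    one side. Ignoring the complement's relations keeps "head C-commands the
    complement and its terminals" (head-initial); ignoring the head's relations
    keeps the reverse (head-final)."""
    if ignore_side == "complement":
        return [head] + list(comp_terminals)      # head-initial
    if ignore_side == "head":
        return list(comp_terminals) + [head]      # head-final
    raise ValueError("ignore_side must be 'complement' or 'head'")

print(order_head_complement("V", ["D", "N"], ignore_side="head"))   # ['D', 'N', 'V']
```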
5.9. A Note on the Nonexistence of Spell-Out At this point it should be clear that the notion of Spell-Out as a rule applying at some point in a derivation is apparently inconsistent with the strictly derivational approach we advance. To the extent that PF receives instructions concerning the linear order of terminals, these instructions take the form of derivational steps and the C-command relations they entail. As C-command is a property not of phrase-markers but of derivations, the rule Spell-Out applied to some phrase-marker cannot easily be made applicable. The situation is exactly parallel to the case of semantic interpretation, as in chapter 2: interpretation proceeds derivationally. Likewise, phonological "interpretation" also proceeds derivationally: each step of the derivation, rather than a particular phrase-marker, serves as an instruction to the PF component. This is not to suggest that phonological computation does not make use of a representational system itself. The interpretation of a set of derivational steps might well be used to build up intermediate phonological representations, which are then transformationally altered into a form that meets the requirements of the PF interface. But a point worth stressing is that we have no way of defining a "covert" component of the syntax, all of whose operations apply "after" the overt component—that is, after the application of the "Spell-Out" rule. Instead, we simply say that an op-
eration O in a derivation D either receives or does not receive a phonological interpretation. If it does, then it is an overt operation; if it does not, then it is a covert operation.14
5.10. Summary
In this chapter we looked at an important empirical difference between the representational and derivational construals of C-command with respect to countercyclic operations in the overt syntax. We found that under the derivational, but not the representational construal, countercyclic operations do not correctly order lexical items under a theory of precedence such as Kayne's LCA. Furthermore, we proposed a somewhat different version of the LCA which accommodates the conflicting C-command relations that ensue upon remerger—that is, movement of a category, without reference to the copy theory of traces, or any other theory of syntactic traces, for that matter. We then showed that the "repair strategy" hypothesized leaves open the possibility of the head parameter, which can be seen as a subcase of the procedure that resolves the ordering paradox, namely, the PRP, needed to accommodate movement.
Notes
1. An analysis of this kind was first published by Kawashima and Kitahara (1995) and proposed independently by Groat (1995b, 1997).
2. Recall from chapter 4 that the preceding structure may not be correct, depending on whether we adopt the possibility of nonprojecting Merge; if we do adopt nonprojecting Merge, then there is no Y' level in the structure at all.
3. It could be argued that WP could not merge countercyclically with Y' in this example, since Y', being an intermediate-level projection, is invisible to syntactic operations; see also note 2 concerning the chapter 4 analysis in which the term Y' does not even exist under nonprojecting Merge. The issue at hand is, however, more general: if we take XP to be instead a head X that projects, then Y' would be instead a YP, a visible term with which merger would be possible.
4. This idea is problematic: if feature raising is a syntactic and not a morphological operation, then the internal structure of the lexical item to which a raised feature moves is by definition a syntactic structure, as it is visible to the syntactic operation Attract-F, participates in Checking Relations, and is the well-formed output of a
syntactic structure-building rule. Since the "syntactic" structure of a head in a phrase-marker is not the root node of the phrase-marker, a syntactic operation that operates on a head is by definition countercyclic. But let us assume for now that Chomsky's approach is tenable.
5. In the terminology of chapter 6, the two operations do not stand in a "must follow" relation.
6. This is true under either a derivational or representational definition of Dominance; see chapter 6 for a derivational construal of the Dominance relation.
7. In fact, it seems quite plausible to imagine a Merge operation that does not create any output at all: instead of a triplet <X, Y, Z> meaning "Merge X and Y, forming Z = {H(X or Y), {X, Y}}," we have <X, Y> meaning "Merge X and Y with no output." This is perhaps the correct formulation of Merge which does not semantically compose, under the rubric of inclusiveness (see discussion in chapter 3); minimally, we desire not to introduce symbols, such as a new projection, that do not correspond to any interpretation.
8. As discussed in the text above, "Simple" Merge means merger does not, in the countercyclic case, insert its output term into a phrase-marker of which one of its inputs was a term.
9. In other words, given two Merge/Move operations that are countercyclic relative to each other, one of them must be both covert and noncompositional. Interestingly, this does not restrict covert Merge/Move to feature-checking: Quantifier Raising, in which only (noncompositional) scope relations are established, can well be covert and countercyclic.
10. In independent work, Nunes (1994, 1995) has suggested an approach not unlike the analysis that follows: traces of categories, sharing elements of the Numeration with their nondistinct copies, create an LCA-inconsistent configuration that must be "repaired" at PF by a deletion strategy, generally deleting the trace copy and not the higher copy. His work differs, however, in assuming a copy theory of traces under a representational approach to Spell-Out. We cannot adequately review his approach here; the reader is referred to the works in the bibliography.
11. See Bobaljik (1995) for an account of the role of adjacency in feature checking. The issue becomes more difficult in light of the vP hypothesis and multiple specifiers. Many languages have been analyzed as involving object shift to a second specifier of vP. Such object raising would "cross over" an intervening subject in an inner Spec vP. But the subject would block adjacency between the raised object and v°. See Chomsky (1995) and Ura (1996); see also chapter 3.
12. See also the discussion in Chomsky (1995:294). He suggests an economy principle of the form in (i) (Chomsky's (76)):
(i) α enters the Numeration only if it has an effect on output.
If we extend this economy constraint to features of α as well as lexical items themselves, then a strong feature could only enter the Numeration if it had an effect on the output. If, furthermore, we hypothesize that a strong feature has no semantic interpretation and hence no effect on semantic interpretation, it must have some effect on PF output. Then, if the second option in (9) is chosen such that the precedence relation between H° and XP is not changed, there will be no change in PF output, in violation of (i).
13. Of course, even greater minimization of computational complexity in the PF component could be achieved if there were no movement at all.
14. Cf. Groat and O'Neil's (1995) hypothesis that all structure-building is cyclic and overt, with Spell-Out applying to the LF interface phrase-marker. Whether or not a Move operation (e.g., cyclic object shift in English) displaces phonological features is then the deciding factor concerning whether a given operation is "overt" or "covert." Brody (1995, 1996) advances a related, but representational, approach, without a derivational movement rule, in which "overt" movement corresponds to pronunciation of a category at the head of its chain, while "covert" movement corresponds to pronunciation of a category at the tail of its chain.
6 On Derivationality
There is no being, only becoming. —Friedrich Nietzsche
In this chapter we take a closer look at a number of fundamental issues that arise in the context of the derivational framework we are proposing. First, we begin with a second look at the derivational construal of C-command and provide a formalization of the relation that overcomes a possible objection to the derivational construal analysis given in chapter 1. Second, we recap the implications of our approach for trace theory and the distinction between Merge and Move; in the final analysis, traces under this approach are epiphenomenal, and Merge and Move can in fact be unified (see also Kitahara 1994, 1995) as a single pairing operation. Third, we step back from the analysis a bit and compare the architecture of such a system with that of a representational system, suggesting that an argument in favor of the derivational architecture can be found in the natural way it simultaneously accommodates both symmetrical and asymmetrical relations.
6.1. A More Detailed Analysis of Derivational C-command
Chapter 1 introduced the derivational approach to C-command, which formed the basis of all of the analyses presented in subsequent chapters. One objection to the logic of the analysis in chapter 1 runs as follows. According to that analysis, the syntactic relation
C-command is a natural relation, found by a close examination of the operation Merge (or any instance of concatenation). By construing what we mean by Merge (X, Y) to be "put X and Y into a relation," C-command should fall out naturally. However, this doesn't quite seem to be the case, as there are two other scenarios that appear equally natural, if not more natural, to Merge (X, Y) taken alone. Consider the following logic, a paraphrase of the argument in chapter 1: when X merges with Y, X gets into a relation since it has been merged. With what? Naturally, with Y. What is Y? A collection of terms. Thus X gets into a relation with a bunch of terms, and this corresponds to C-command. So far so good. But now consider the following extension to this argument: What is X? It is another (perhaps singular) collection of terms. Therefore, this second collection of terms of X gets into a relation with the first collection of terms of Y. Now it appears that every term in a tree T1 should enter a relation with every term in the tree T2 that T1 is paired with, quite clearly nothing like C-command. Why should it be that the "relator" X is construed as simply the term X (and crucially not the terms internal to X), while the "relatees" (C-commandees) of X are construed as each of the terms of Y? That is, when X is merged with Y, X itself, but not the terms in X, C-commands all the terms of Y, not just Y itself.1 What appears to be at issue here is what we mean by X and by Y when we talk about syntactic relations created by their merger. The two possible construals of X and Y are to regard each of them as either a single term (i.e., X or Y itself), or to regard each as a collection of terms (i.e., the terms of X and of Y). Preferably we would construe X and Y consistently throughout. If X and Y are construed as single terms throughout, then merger of X and Y is expected to result in a syntactic relation only between the single terms X and Y, but this would be simple sisterhood, not C-command. On the other hand, if we construe both X and Y as collections of terms, as suggested in the preceding paragraph, then merger of X and Y is expected to establish relations between all the terms in both X and Y. But such a relation does not bear any resemblance to C-command either—it resembles, rather, the representational definition "A does not dominate B," A and B being terms of X and Y, respectively.
Crucially, it appears that we require the third "mixed" case, where the "relator" (say X) is construed as a single term, while the relatees are a collection of terms (say all the terms of Y). Why does this nonidentical construal of X (as the term X itself) versus Y (as the collection of the terms of Y), seem to be the syntactically important one—that is, the one that looks like C-command? Perhaps the asymmetry stems from some inherent asymmetry internal to the operation Merge (X, Y). Its only asymmetric property, however, is the choice of projection: either X or Y projects. It is difficult to see how this asymmetry could get us out of the dilemma. To summarize, it appears that C-command is not perfectly natural to the operation Merge by itself. However, there is another asymmetry in the system of Generalized Transformations not internal to Merge itself: the order of rule-application. What we shall argue here is that we must construe C-command not simply as a property of Merge, but of Merge in the context of a derivation. In other words, C-command is neither a natural property of a representation nor a natural property of Merge in itself; rather, it is a natural property of a derivation. In the following three subsections we will in effect be formalizing the First Law approach given in chapter 1, section 4, an approach that sought to deduce the derivational definition of C-command: "X C-commands all and only the terms of the category Y with which X was paired/concatenated by Merge or Move in the course of the derivation." Consider the following observation concerning a derivation: a derivation consists of a "partly ordered" set (in fact, a Quasi Order2) of Merge/Move operations: some operations follow other operations by necessity. In any given derivation we find that in every case it is the term that merges after some other term that gets into the desired asymmetric relation to the other term, while those terms that merge simultaneously with other terms get into the desired symmetric relation with other terms. In the following sections we will reformulate the derivational definition of C-command in these terms. We begin by providing a more formal look at the "structure" of a derivation under cyclic structure-building.3 We can then show that the asymmetric relation Dominance falls out naturally as the set of new relations entered into by the output of any structure-building rule. We will then show
that, in contrast, C-command falls out naturally as the set of new relations entered into by the input to any structure-building rule. As input and output together constitute all of the objects involved in structure-building, it follows that Dominance and C-command together should in principle account for all of the relations entered into via structure-building.
6.1.1. The Structure of Cyclic Derivations
Restricting our attention for now to cyclic structure-building, consider the derivation in (1) of a category G = [G [E A B] [F C D]]. As a shorthand/notational convenience, we define the triple <X, Y, Z> to stand for the rule-application "Merge X and Y, forming Z." Thus the first two elements of a triple are the input to Merge, while the third element is the output of Merge. For consistency, the notation <X> indicates the operation Select, which selects a token of the lexical item X from the Numeration (or lexicon; see Chomsky 1995:226; Collins 1996).
(1)
<A>   <B>        <C>   <D>
   \   /            \   /
  <A, B, E>      <C, D, F>
          \        /
          <E, F, G>
We begin with lexical items A, B, C and D drawn from the lexicon; that is, they are in a "computational workspace."4 The first line of the derivation, containing only lexical items, thus stands for four operations: Select A, Select B, Select C, and Select D. Obviously, to form E and F, the lexical items A, B, C, and D must be available to computation; that is, they must have been selected
from the lexicon (or Numeration; see above references) and placed in the computational workspace before the Merge operations forming E and F can apply. Thus the Merge operation <A, B, E> must follow Select A and Select B, and the operation <C, D, F> must follow Select C and Select D. We use arrows here to denote this necessary relation. Note that the selection of A, B, C and D are not ordered in this way; they can all have been selected in any order, or simultaneously. Their relative order of selection is irrelevant to structure-building. Thus the most minimal description of the derivation includes no information on any "incidental" order. Similarly, <A, B, E> and <C, D, F> are not derivationally ordered, since no aspect of structure-building requires one operation to take place before the other takes place. Hence, they do not stand in a "must follow" relation. Finally, we see that the operation <E, F, G> must follow both <A, B, E> and <C, D, F>, since E and F do not exist until after these two operations have been performed. Note that the arrows in (1), indicating which operations must follow which, denote necessary relations between operations in the derivation. The arrows do not in fact encode the entire set of "must follow" relations: it is clear that the operation <E, F, G> "must follow" <A>, <B>, and so forth; thus the "must follow" relation is in fact the transitive closure of the relation indicated by the arrows. For clarity of illustration, we have chosen not to clutter the diagram with implicit arrows.5 The asymmetry of the "must follow" relation is at the core of the asymmetry of the Dominance and C-command relations, as we shall see. Thus the arrows denote the natural and inherent relation between the syntactic operations Select and Merge under derivational structure-building: i.e., they determine a quasi-order on the set of operations. But this is not the sort of relation that we ultimately want: we are looking for relations between terms, such as C-command—that is, a relation on a set of terms. Instead of simply stipulating relations between terms, let us make use of the relation that is inherent to the derivational system: the "must follow" relation. What we shall do is map this relation on operations into a relation on the terms in those operations. It will in fact be easier to begin by considering the Dominance relation and then move on to C-command.
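For readers who find an executable rendering helpful, the following is a small sketch (in Python; the tuple encoding and names are ours and purely illustrative) of the derivation in (1), its immediate "must follow" arrows, and the transitive closure just described:

```python
# Each operation is encoded as a tuple: ("A",) for Select A, ("A", "B", "E") for
# "Merge A and B, forming E".  ARROWS lists only the immediate arrows of (1).
ARROWS = {
    ("A", "B", "E"): [("A",), ("B",)],
    ("C", "D", "F"): [("C",), ("D",)],
    ("E", "F", "G"): [("A", "B", "E"), ("C", "D", "F")],
}

def must_follow(op, arrows=ARROWS):
    """The full "must follow" relation for `op`: the transitive closure of the
    immediate arrows (transitive, irreflexive, and asymmetric by construction)."""
    seen, stack = set(), list(arrows.get(op, []))
    while stack:
        prior = stack.pop()
        if prior not in seen:
            seen.add(prior)
            stack.extend(arrows.get(prior, []))
    return seen

efg, abe, cdf = ("E", "F", "G"), ("A", "B", "E"), ("C", "D", "F")
assert abe in must_follow(efg) and ("A",) in must_follow(efg)        # implicit arrows recovered
assert cdf not in must_follow(abe) and abe not in must_follow(cdf)   # not derivationally ordered
```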
6.1.2. A Derivational Definition of "Dominance"
First let us look at the output of the operations in the derivation of (1), repeated as (2). Recall that the third element of each triple is to be construed as the output of Merge (X, Y); the output of Select in the top line of the derivation is simply a token of the lexical item selected.
(2)
<A>   <B>        <C>   <D>
   \   /            \   /
  <A, B, E>      <C, D, F>
          \        /
          <E, F, G>
The four outputs of the Select operations in the first line are simply tokens of the lexical items selected. The two outputs of the operations on the second line are E = [A B] and F = [C D]. The single output of the operation on the third line is G = [[A B] [C D]].6 Let us look first at the operation <C, D, F>, whose output is F. In our computational workspace, a change has taken place: F has appeared. What syntactic relations do we expect F to enter into? Since by "syntactic relation" we mean a relation between F and other terms, we are asking what other terms F gets into a relation with. What else is there? As far as F is concerned, only C and D. The parallel construction of E and selection of A and B are irrelevant to the derivation of F; that is, following chapter 1, they don't exist. In other words, the categories C and D are "new" to F; F is in no relation to them until the derivational step <C, D, F> is applied (in fact, trivially so, since F did not exist before this operation was applied). Now what is the nature of the relation between C, D, and F? Optimally, we eschew some arbitrary relation defined on these categories, in favor of some relation that we can read off the
derivation, one which is necessarily extant in the derivation—that is, the relation between the operations containing the terms in question. This relation is the irreflexive, (trivially) transitive, antisymmetric (and hence asymmetric) relation between <C, D, F> and the two operations <C> and <D>. Thus we expect the relation between F and C and D to be identically irreflexive, transitive, and antisymmetric. Let us call the relation R. We then have R(F, C) and R(F, D). Interestingly, this amounts to the Dominance relation. As we can see, the same logic can be applied to the category G as the output of the operation <E, F, G>: G gets into a relation with those categories that are in operations (i.e., mentioned in the formal statement of the operation) with which <E, F, G> is in a relation—that is, A, B, C, D, E and F. Furthermore, the relation is minimally the same relation; it is similarly irreflexive, transitive and antisymmetric, and we have R(G, A), R(G, B), R(G, C), R(G, D), R(G, E), and R(G, F). Again, this corresponds exactly to a typical, irreflexive Dominance relation. Now A, B, C and D are also the outputs of an operation, namely, the operation Select, but none of the Select operations is in a "must follow" relation with any other operation; thus none of the outputs of those operations are in a relation with any other category. In other words, they do not dominate anything. We can now provide a formal definition of the Dominance relation, given a formal characterization of a derivation D. From the above discussion, it is clear that (i) we are considering the output of an operation, (ii) that the output enters into relations with categories that are "new" to it, and (iii) that the relation is isomorphic to the "must follow" relation among the operations containing the categories in question. Putting these together with a formalization of a derivation, we obtain the following:
(3) a. Definition of Derivation
A derivation D is a pair <O, M>, where
i. O is a set of operations {o1, o2, ..., on} (Merge and Select) on a set S of lexical items in a Numeration, and terms formed by those operations, and
ii. M is a set of pairs of the form <oi, oj>, meaning oi "must follow" oj. D is transitive, irreflexive and antisymmetric (a quasi order).
b. Definition of Dominance
Given a derivation D = <O, M>, let X, Y ∈ S. Then X dominates Y iff
i. X is the output of some oi ∈ O, (we consider outputs) and
ii. X is not in a relation with Y (the relation is "new") in any proper subderivation D' of D, and
iii. Y is a member of some oj ∈ O such that <oi, oj> ∈ M. (the terms are in a relation only if the operations are)
To paraphrase, a category X dominates a category Y precisely if X is the output of an operation (3b.i) that must follow an operation on Y (3b.iii). In other words, simply translating the quasi order inherent to the derivation (the relation between operations inherent to cyclic structure-building) to the output of operations yields the quasi order of Dominance. Essentially, this is nothing more than a formalization of the intuition that a category dominates the categories of which it is constituted. Part ii of the definition in (3b) appears to be redundant here: we know that X couldn't possibly be in a relation with Y previous to oi, since X is by assumption the output of oi; it didn't exist before and thus couldn't have got into a relation by virtue of some subderivation. This part of the definition is meant to capture the idea that X gets into a relation, minimally, only with categories with which it was not in a relation before (see again chapter 1, section 4). We shall retain this part of the definition anyway, both to preserve the insight and to show (in the next section) how this definition is nearly identical to the definition of C-command. Following this logic, we can arrive at a similar definition of C-command. The analysis will be a close paraphrase of both the preceding analysis of Dominance and of the analysis of C-command in chapter 1.
6.1.3. Revisiting the First Law of Syntax: C-Command
The two structure-building operations we have looked at are Select and Merge; each has an input and an output. We have seen that Dominance is simply a mapping of the "must follow" relation M between operations in O to a relation between the output of those operations and other categories in O. Let us consider now the categories that are the input to operations in O, which, if we are to conceive of intercategorial relations as properties of operations (rule-applications) in a derivation, is all there is left to consider. Let us reproduce our derivation once again in example (4):
(4)
<A>   <B>        <C>   <D>
   \   /            \   /
  <A, B, E>      <C, D, F>
          \        /
          <E, F, G>
Let us first examine the operation <E, F, G> as pertains to its inputs, F and E. Consider F: as in the analysis of Dominance, we ask what relations might obtain between F and other categories. What categories are "new" to F by virtue of this operation? Certainly A, B, and E, since before this operation was applied F was not in a tree with these categories; certainly not C, D and F, since F was already derivationally related to C and D by virtue of the operation <C, D, F>. What, then, is the character of the relation between F and A, B, and E? Again we eschew an arbitrarily formulated relation in favor of a relation we can read off the derivation, one which is necessarily extant in the derivation—that is, the relation between the operations containing the terms in question. We see that the operation <E, F, G> "must follow" all of the operations involving terms new to F—a transitive, irreflexive, and antisymmetric relation. Thus it is natural to say that F gets into a relation with those categories that are in operations (i.e., mentioned in the formal statement of the
operation) with which <E, F, G> is in a relation. Let us call the relation S, and we then have S(F, A), S(F, B), S(F, E): in other words, C-command. Does G get into the relation in question? No, it is not an input to the operation. Does A get into the relation in question? Yes, with F, since there is a relation between the operations mentioning A and F. Is this relation S(A, F) or S(F, A), or both? It is only S(F, A), since the relation between the operations mentioning F as input is in an antisymmetric relation with all operations mentioning A. We are now in a position to formalize the derivational definition of C-command, taking into account the same three factors looked at previously in the analysis of Dominance, with the only change being that the relation is defined on the input to a structure-building operation.
(5) Definition of C-command
Given a derivation D = <O, M>, let X, Y ∈ S. Then X C-commands Y iff
i. X is the input of some oi ∈ O, (we consider inputs) and
ii. X is not in a relation with Y (the relation is "new") in any proper subderivation D' of D, and
iii. Y is a member of some oj ∈ O such that <oi, oj> ∈ M. (the terms are in a relation only if the operations are)
We can see that this definition is identical to the definition of Dominance with the exception of (5i): we are defining the relation entered into by the input to an operation. Note that unlike the derived Dominance relation, the antisymmetric character of the "must follow" relation does not carry over to C-command, though asymmetry is preserved. As we have seen, F C-commands E, but E also C-commands F, since E is (1) an input to <E, F, G>, (2) E neither C-commanded nor dominated F before <E, F, G>, and (3) the operation <E, F, G> is in the "must follow" relation with an operation containing F, namely, <C, D, F>. Thus we have E C-commands F and F C-commands E; in general, "sisters" (a representationally defined notion) will C-command each other.7
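As a self-contained illustration (again in Python, with an encoding and helper names that are ours alone), definitions (3b) and (5) can be implemented directly over the toy derivation of G and checked against the relations discussed in the text:

```python
# Illustrative encoding of the derivation of G = [G [E A B] [F C D]]:
# ("A",) stands for Select A; ("A", "B", "E") for "Merge A and B, forming E".
MERGES = [("A", "B", "E"), ("C", "D", "F"), ("E", "F", "G")]
ARROWS = {                                  # immediate "must follow" arrows
    ("A", "B", "E"): [("A",), ("B",)],
    ("C", "D", "F"): [("C",), ("D",)],
    ("E", "F", "G"): [("A", "B", "E"), ("C", "D", "F")],
}

def must_follow(op):
    """The relation M restricted to `op`: transitive closure of the arrows."""
    seen, stack = set(), list(ARROWS.get(op, []))
    while stack:
        prior = stack.pop()
        if prior not in seen:
            seen.add(prior)
            stack.extend(ARROWS.get(prior, []))
    return seen

def dominates(x, y):
    """(3b): X dominates Y iff X is the output of an operation o and Y is
    a member of some operation that o must follow."""
    return any(op[-1] == x and any(y in prior for prior in must_follow(op))
               for op in MERGES)

def c_commands(x, y):
    """(5): X C-commands Y iff X is an input to an operation o, Y is a member
    of some operation that o must follow, and X was not already related to Y
    in a proper subderivation (which, in this cyclic toy case, reduces to
    X not already dominating Y)."""
    return (not dominates(x, y) and
            any(x in op[:-1] and any(y in prior for prior in must_follow(op))
                for op in MERGES))

assert dominates("G", "A") and dominates("F", "C") and not dominates("E", "C")
assert c_commands("F", "A") and c_commands("E", "F") and c_commands("A", "B")
assert not c_commands("F", "C") and not c_commands("A", "F")   # F dominates C; A is "too low"
assert c_commands("F", "F")    # reflexive C-command, as noted in note 7 to this chapter
```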
That there should be two differing intercategorial relations, one based on the input (Structural Description) of the transformational operation, and one based on the output (Structural Change) of a transformational operation, is entirely natural, since these are precisely the defining elements of a transformational rule. (In fact, the distinction precisely parallels a claim in the "First Law of Syntax" in chapter 1: two trees are never "disconnected" if they are merged together, even though it looks as if they were, directly before they merged. All this means, though, is that the two trees were not yet inputs to a structure-building rule.) In other words, if syntactic operations create syntactic relations, the status of the relation (Dominance or C-command) is expected to parallel the status of the terms in the rule (input or output). In fact, the relations are identically defined, except that we begin in one case with the output of the operation (yielding Dominance), and in the other with the input (yielding C-command). From there we proceed identically: we look at all categories which are "new" to the input/output, and we take the relation to be identical to the relation M between the operations involving the categories in question (the "must follow" relation). Since the relation M is a defining property of derivation itself, the transitivity and, in the case of Dominance, antisymmetry and irreflexivity of the intercategorial relations receive a straightforward explanation. We thus address the problem posed at the beginning of this section. If X merges with Y, and therefore X C-commands Y and all its terms, why don't all the terms of X similarly C-command Y and all its terms? As we can now see, no answer to this question can be found by looking at the rule Merge by itself. What is necessary is to look at the operation in the context of a derivation, for it is in the derivation that we find the asymmetric relation that maps onto C-command and Dominance: the inevitable "must follow" relation.
6.2. On the Non-naturalness of the Representational Definition of C-command
Interestingly, there has been independent work toward an explanation of C-command, that of Chametsky (1996). His analysis is an
excellent and intriguing approach to the question of C-command from a representational point of view and constitutes a serious challenge to the idea that the derivational approach yields the most natural account of C-command. Our counterargument will simply be that the account incorporates a particular notion, namely, that of a factorization of a phrase-marker, the inclusion of which serves only to define the C-command relation and is not a necessary or inherent property of a representation. In contrast, the derivational approach relies only on ordered transformational rule-application. Chametsky (1996) provides what is to our knowledge the first attempt at an explanatory account of C-command. Given the theory of phrase-structure he assumes, his account does indeed strongly suggest a very natural understanding of the relation. Chametsky's analysis is "representational." His claim is that given a phrase-structure representation, there is a simple and natural way to "read off" C-command relations from the structure of the tree independently of whether the tree is recursively defined or derivationally constructed. His analysis assumes a theory of phrase-structure developed in chapter 1 of Chametsky (1996), a "Minimal Phrase Structure Theory" (henceforth MPST, as per Chametsky 1996), which is a very simple theory strongly resembling the bare theory of Chomsky (1995). For Chametsky, a phrase-marker is a set of nodes with one or more labels ordered by the Dominance relation (which he takes to be reflexive), but without a precedence relation. The details of and arguments for his proposal are compelling, but we will not reproduce them here as they are not important to the discussion at hand. Chametsky's conceptual move is to pose the question of C-command in the following way: instead of asking what set of categories a given category X C-commands and then seeking an account of this, we should ask what set of categories C-commands a given category X. His discovery is that for any node X in a tree, the set C of C-commanding nodes provides a minimal complete factorization of the phrase-marker with respect to node X. Consider the phrase-structure in example (6) from Chametsky (1996: 28)8:
Let us consider the node G and ask what nodes C-command it. The set of C-commanding nodes is {B, F, E} under the following definition of C-command (Chametsky 1996:27):
(7) For any node A, the C-commanders of A are all the sisters of every node which dominates A.
It turns out that the set of nodes that C-command G constitutes a minimal (complete) factorization of the phrase-marker: there is no smaller set of nodes such that the entire phrase-marker is factored by the union of {G} and the set of C-commanding nodes. Informally, given the node G, the smallest set of nodes that could be "added up" with G (but do not redundantly contain G themselves) and thereby yield the full phrase-marker is the minimal factorization of the phrase-marker with respect to G. That the set of nodes to be considered provides the minimal factorization of the tree is natural under the following considerations: to begin with, it is a minimal and simple set to define, as opposed to some other, arbitrary set of nodes that provide a factorization. Equally simple, as Chametsky points out, would be the maxi-
mal factorization, as it, too, is unique: in the preceding example, the maximal factorization would be the union of {G} with all of the other pre-terminal nodes—that is, {B, F, H, J, K}—as these, too, "add up" with G to form the full phrase-marker. But, Chametsky argues, this set of nodes "denies the relevance of [primitive] Dominance relations not involving the target node [G] for all further relations [beyond Dominance] involving the target. Thus, of the two nonarbitrary sets [inducing the minimal and maximal factorizations], only the minimal set requires the full branching hierarchical structure—the full phrase-marker" (Chametsky 1996:31). The issue we raise concerning this latter argument is a small one, but perhaps significant: it is not clear in what way the maximal set does not require "the full branching hierarchical structure." The factorization of any phrase-marker requires the existence of a phrase-marker to factorize, in which all nodes are in a Dominance relation with (at least) the mother node of the representation. Thus to determine which factorization is maximal we do require the full branching hierarchical structure—that is, there is no way of ascertaining whether a given node X is in the maximal set unless we know whether or not X stands in a Dominance relation with nodes other than X itself.9 If it does, then some subset of the nodes it dominates will be in the maximal set, and X will not be. Thus the very notion of maximal factorization requires the same full branching hierarchical structure that the minimal factorization does. But a perhaps more compelling argument against this approach is that the notion of a factorization is needed to explain the naturalness of the representational definition of C-command, but to the best of our knowledge for nothing else. Though a factorization is easily defined, its definition serves only to facilitate a particular outlook on C-command; no notion of factorization is required independently. In contrast, the derivational definition given in (5) is based on the "must follow" relation that is a defining property of derivational structure-building and hence inherent to the computational system. 6.3. Trace Theory and the Unification of Merge and Move Two other issues that should be summarized are the status of trace theory and the related issue of whether Merge and Move remain
distinguishable in the approach we have advanced. First, let us consider the status of traces, which will be seen to differ greatly from the Copy theory generally assumed in minimalist analyses. Minimalist analyses since Chomsky (1993) have generally made use of a copy theory of traces. In chapter 2 we reviewed analyses of reconstruction and binding effects which utilized the existence of full category copies at the LF level. But the analysis we ultimately proposed makes no reference to trace copies with respect to semantic interpretation: what corresponds to "reconstruction" of a category in a trace position is recast as derivational interpretation of that category at that point in the derivation at which the category merges/moves to that position. Similarly, the analysis of movement as Remerge in our approach to the LCA and PF computation does not refer to traces. A category that undergoes remerger provides contradictory instructions to PF in the form of symmetrical C-command relations. Now, under a trace theory, it is one of the two copies of the category that must be deleted at PF. But under our theory, it is one of the two sets of C-command relations that is deleted at PF (by the PRP of chapter 5). Finally, the issue of the copy theory of traces raises a puzzle concerning the featural status of the trace copy. If a category XP (or, under Attract-F, a bundle [FF] of formal features) moves to check a feature F that subsequently deletes, does the trace copy left behind by movement still bear an unchecked feature? If not, why not? If so, does it, too, need to undergo checking?10 Under our approach, there appears to be no need to invoke a mechanism of "copying" in order to move a term. Merge is defined as an operation on terms, and terms are defined (derivationally; see chapter 2). Therefore, nothing prevents a term from being merged more than once; this is simply Remerge. We thus abandon a syntactic theory of traces. Nonetheless, Remerge might be an operation distinct from Merge. But is there now any reason to suspect that it is? Movement has generally been stipulated to be constrained in particular ways that do not and/or cannot apply to Merge. The important distinguishing properties include: (1) Move is a Singulary Transformation, while Merge is Binary; (2) Move is constrained by Greed and is confined to feature-checking, while Merge does not result in
feature-checking; (3) Move is restricted by a C-command relation between the antecedent and its trace (either derivationally or representationally), while this is not applicable to Binary Transformations; (4) Move is constrained by (some version of) a Shortest Move constraint, which is not defined for Merge, and (5) Move creates a copy of a category, while Merge does not. However, these distinguishing properties have all disappeared under our framework. Regarding (1), we explored in chapter 3 an analysis in which we did not restrict movement to Singulary transformations but allowed it to apply "interarboreally" across phrase-markers (cf. Bobaljik and Brown 1997). Regarding (2), since interpretation proceeds derivationally, any operation that receives an interpretation at either interface is a legitimate operation, be it Merge or Remerge; additionally, Merge may in fact be a Checking Relation, as seen in chapter 3, since Derivational Sisterhood obtains between a phrase and a head that are merged. Regarding (3), we saw that Derivational Sisterhood, a natural notion of locality, generally yields the result that a moving category C-commands the position from which it moved, since other movements do not result in a Checking Relation; the exception to this, head-movement, is precisely the desired exception. Regarding (4), the economy metric that constrains movement in our system is a general constraint on syntactic operations, Minimize Sisters, as developed in chapter 4. This constraint applies equally to all operations, Merge or Move, and thus does not serve to distinguish them. Finally, regarding (5), it is clear that once we abandon the copy theory of traces, movement does not entail any copying mechanism, any more than Merge does. We are left with no characteristic unique to Move that must be stipulated as one of its properties, making it a rule distinct from Merge. Both Merge and Move are instances of pairing/concatenation, pure and simple. This is a desirable result, since in addition to some means of selecting elements from the lexicon via the rule Select, the syntax is now left with one and only one structure-building rule: Merge.
6.4. A Conceptual Argument for Derivationality
Given the quasi-order of derivational steps in a system of Generalized Transformations, the reading-off of this partial order with respect to the input of these transformations gives us the asymmetric relation of derivational C-command, even though the singular structure-building rule Merge in itself appears symmetrical with respect to its inputs.11 Why should the grammar make use of derivational computation when representational recursive definitions seem to yield the same sorts of objects: terms in a "term-of" relation (or in a Dominance relation)? Consider the following argument: Chomsky's minimalist approach encourages us to view the syntax as an optimal realization of interface conditions. By hypothesis, semantic interpretation makes use of hierarchical structure, for which set-theoretic objects could in principle suffice. One can either define such objects recursively, hence representationally, or define such objects as being the output of rule-application, hence derivationally. A deep question is: "Why should we ascribe one or the other definition to the syntax, when the same hierarchical structure can be defined either by recursive definition or by derivational rule-application?" The answer might come from other interface requirements. Let us consider the PF branch of the grammar. If we adopt even a weak version of the LCA, namely, that the syntax provides information regarding the precedence relations among categories or lexical items, then the syntax must provide an asymmetrical relation of some sort by which precedence relations (which are inherently asymmetric) might be computed. Now, the derivational approach to structure-building, unlike a representational definition of possible structures, entails a partially-ordered sequence of Merge/Move operations. As we have seen, there is a precise correlation between the relative order of introduction of any two categories into a phrase-marker, and the C-command relation that obtains between them: symmetric if simultaneous, asymmetric if ordered, nonexistent if not ordered. We can say that C-command is simply a "reading off" of the partial order inherent to a derivational system of hierarchical structure-building; that is, the asymmetry of
intercategorial relations is equivalent to the asymmetry of structure building. An asymmetric relation such as "X C-commands Z" is simply a restatement of the derivational property, "X was introduced into a phrase-marker after Z was introduced." Viewed in this way, the derivational nature of the syntactic component is indeed an optimal system. Insofar as the semantic component requires hierarchical information, set-theoretic objects satisfy this requirement. To the extent that the PF branch requires asymmetric relations between terms, the derivational definition of hierarchical phrase-structure provides the asymmetric relation of C-command (while the recursive representational definition, as argued previously, does not naturally provide any asymmetric relation, except by pure stipulation in the definition of C-command). Thus, while the requirements of the semantic component do not necessarily include an asymmetrical primitive relation not equivalent to Dominance (though it may make use of such a relation, were it to exist), the PF branch requires it, and the derivational system yields it as a consequence. "It," as it turns out, happens to be C-command, as it is the intercategorial relation we argue to be natural to the derivational system. Once C-command is a property of the system, it may of course be used by the semantic component. To recapitulate, semantics minimally requires information regarding hierarchical structure, while the PF branch minimally requires asymmetric relations between terms that are not hierarchically organized relative to each other. If the syntax admits hierarchical structure by virtue of a representational (recursive) definition, then asymmetries between categories not hierarchically organized must be stipulated representationally, as in representational definitions of C-command. But if the syntax creates hierarchical structure derivationally, asymmetric relations between terms not in any hierarchical relation with each other may be read off the partial order of derivational steps. Thus the derivational nature of structure-building can be seen as an optimal realization of interface conditions that require hierarchy on the one hand and asymmetry on the other.
Notes

1. Conversely, that Y and none of its terms are "relators," while all terms of X are Y's relatees.

2. A relation R on a set A is a quasi order if R is transitive, irreflexive, and antisymmetric. Taking the set A to be the set of rule-applications in a derivation and the relation R to be the relation "necessarily follows," we see that a derivation entails such a quasi order. The structure of a derivation is not a partial order, which is transitive, antisymmetric, and reflexive: since no rule-application can "follow itself," the relation between rule-applications is not reflexive. The definition of "partial order" used here is not identical to the definition used in Kayne (1994); we refer the reader to Stanat and McAllister (1977) for the definitions we use, which appear to be standard usage in computation theory and discrete mathematics.

3. By the "structure" of a derivation, I will mean the relation that must hold between applications of the structure-building rules in that derivation (namely, the "must follow" relation), as opposed to the structure of the set-theoretic objects (phrase-markers) that are the output of some structure-building rule-application.

4. We put aside a formal definition of "Numeration"; see Chomsky (1995); cf. Collins (1997) for an analysis which eliminates the need for a Numeration.

5. The actual "must-follow" relation would amount to the transitive closure of the relation given by the arrows.

6. Again, recall that I use bracket notation in lieu of the bulkier Bare-theoretic notation at this point in the discussion, since labels are not an issue here; no precedence relation between the terms in an expression [X Y] is implied. Thus [X Y] is equivalent to the bare-theoretic set {H(X), {X, Y}} or {H(Y), {X, Y}}.

7. Interestingly, C-command turns out to be reflexive under this definition. Consider the operation <E, F, G> in the derivation in (5). First, the term E is an input to the operation; second, E is not in a relation with E by virtue of any previous operation (i.e., in any subderivation); third, this operation stands in a must-follow relation with the operation which defines E as a term. Thus E C-commands E by definition (8). Given the PRP of chapter 5, reflexive C-command does not pose a problem for the LCA, since the offending reflexive C-command relation may simply be ignored. With respect to semantic interpretation, no category is ever "dependent on itself" for interpretation, hence reflexive C-command relations presumably receive no interpretation by the semantic component. That a pervasive syntactic relation receives no PF or LF interpretation is not necessarily problematic, since there certainly exist Dominance and C-command relations beyond reflexive C-command that, though well defined, are not interpretively or syntactically significant. See chapter 4, section 9 for discussion of precisely this point with respect to the Minimize Sisters economy constraint.

8. The apparent Dominance relation between the pre-terminals and terminals (= lexical items) is in fact illusory; for Chametsky there are no nonbranching nodes; the relation between a syntactic node and a lexical item is construed as an instantiation of the lexical item by a categorial node.
9. Chametsky assumes reflexive Dominance in his characterization of phrase-markers.

10. Nunes (1994, 1995) suggests that it does indeed bear an unchecked feature; for this reason the lower (trace) copy is deleted by PF. But in the case of interpretable features, such as phi-features, no deletion occurs on the moving category, by hypothesis (Chomsky 1995). Thus there is no difference in the feature make-up of the higher and lower copies of the moving category.

11. The rule is not entirely symmetrical, since, by hypothesis, one of the categories merged projects, and the other does not. The asymmetry of C-command, however, appears entirely unconnected with this asymmetry.
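A compact restatement of the order-theoretic terminology in notes 2 and 5 may be useful. It adds nothing to the theory; it simply spells out the cited definitions (Stanat and McAllister 1977) in standard notation, with the symbol $\prec$ chosen here for the relation "necessarily follows" on the set $A$ of rule-applications:

\[
\begin{aligned}
\text{transitivity:}\quad & (x \prec y \wedge y \prec z) \rightarrow x \prec z \quad \text{for all } x, y, z \in A\\
\text{irreflexivity:}\quad & \neg (x \prec x)\\
\text{antisymmetry:}\quad & (x \prec y \wedge y \prec x) \rightarrow x = y
\end{aligned}
\]

A relation with these three properties is a quasi order; a partial order is likewise transitive and antisymmetric but is reflexive ($x \leq x$ for every $x \in A$) rather than irreflexive. The "must-follow" relation of note 5 is then the transitive closure $R^{+}$ of the immediate relation $R$ given by the arrows, that is, the smallest transitive relation containing $R$.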
Bibliography
Abe, Jun. 1993. "Binding Conditions and Scrambling without A/A' Distinction." Ph.D. dissertation, University of Connecticut, Storrs.
Aoun, Joseph, and Audrey Yen-Hui Li. 1989. "Constituency and Scope." Linguistic Inquiry 20: 141-72.
Aoun, Joseph, and Audrey Yen-Hui Li. 1993. The Syntax of Scope. MIT Press, Cambridge.
Aoun, Joseph, and Dominique Sportiche. 1983. "On the Formal Theory of Government." Linguistic Review 2: 211-35.
Barss, Andrew. 1986. "Chains and Anaphoric Dependencies." Ph.D. dissertation, MIT, Cambridge.
Belletti, Adriana, and Luigi Rizzi. 1988. "Psych-Verbs and θ-Theory." Natural Language and Linguistic Theory 6: 291-352.
Berwick, Robert, and Samuel D. Epstein. 1995. "On the Convergence of 'Minimalist' Syntax and Categorial Grammar." In Algebraic Methods in Language Processing 1995: Proceedings of the Twente Workshop on Language Technology 10, joint with the First Algebraic Methodology and Software Technology (AMAST) Workshop on Language Processing, ed. A. Nijholt, G. Scollo, and R. Steetskamp. Universiteit Twente, Enschede.
Bobaljik, J. 1995. "Morphosyntax: The Syntax of Verbal Inflection." Ph.D. dissertation, MIT, Cambridge.
Bobaljik, J., and S. Brown. 1997. "Interarboreal Operations: Head Movement and the Extension Requirement." Linguistic Inquiry 28: 345-56.
Brody, Michael. 1995. Lexico-Logical Form: A Radically Minimalist Theory. MIT Press, Cambridge.
Chametsky, Robert A. 1996. A Theory of Phrase Markers and the Extended Base. State University of New York Press, Albany.
Chomsky, Noam. 1973. "Conditions on Transformations." In A Festschrift for Morris Halle, ed. Stephen R. Anderson and Paul Kiparsky. Holt, Rinehart and Winston, New York, 232-86.
Chomsky, Noam. 1976. "Conditions on Rules of Grammar." Linguistic Analysis 2: 303-61.
Chomsky, Noam. 1981. Lectures on Government and Binding. Foris, Dordrecht.
Chomsky, Noam. 1982. Some Concepts and Consequences of the Theory of Government and Binding. MIT Press, Cambridge.
Chomsky, Noam. 1986a. Barriers. MIT Press, Cambridge.
Chomsky, Noam. 1986b. Knowledge of Language: Its Nature, Origin, and Use. Praeger, New York.
Chomsky, Noam. 1991. "Some Notes on Economy of Derivation and Representation." In Principles and Parameters in Comparative Grammar, ed. Robert Freidin. MIT Press, Cambridge, 417-54. Reprinted in The Minimalist Program, 1995, MIT Press, Cambridge, 129-66.
Chomsky, Noam. 1993. "A Minimalist Program for Linguistic Theory." In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, ed. Kenneth Hale and Samuel Jay Keyser. MIT Press, Cambridge, 1-52. Reprinted in The Minimalist Program, 1995, MIT Press, Cambridge, 167-217.
Chomsky, Noam. 1994. "Bare Phrase Structure." MIT Occasional Papers in Linguistics 5, Department of Linguistics and Philosophy, MIT, Cambridge. Published in Evolution and Revolution in Linguistic Theory: Essays in Honor of Carlos Otero, ed. Hector Campos and Paula Kempchinsky. Georgetown University Press, Washington, D.C., 51-109. Also published in Government and Binding Theory and the Minimalist Program, ed. Gert Webelhuth. Blackwell, Oxford, 1995, 383-439.
Chomsky, Noam. 1995. "Categories and Transformations." In The Minimalist Program. MIT Press, Cambridge, 219-394.
Chomsky, Noam, and Howard Lasnik. 1993. "Principles and Parameters Theory." In Syntax: An International Handbook of Contemporary Research, ed. Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann. Walter de Gruyter, Berlin. Reprinted in The Minimalist Program, 1995, MIT Press, Cambridge, 13-127.
Cinque, Guglielmo. 1990. Types of A'-Dependencies. MIT Press, Cambridge.
Collins, Chris. 1994. "Economy of Derivation and the Generalized Proper Binding Condition." Linguistic Inquiry 25: 45-61.
Collins, Chris. 1995. "Toward a Theory of Optimal Derivations." MIT Working Papers in Linguistics 27: Papers on Minimalist Syntax. Department of Linguistics and Philosophy, MIT, Cambridge, 65-103.
Collins, Chris. 1996. Local Economy. MIT Press, Cambridge.
Davis, Lori J. 1986. "Remarks on the Theta Criterion and Case." Linguistic Inquiry 17: 564-68.
den Dikken, Marcel. 1995. "Binding, Expletives, and Levels." Linguistic Inquiry 26: 347-54.
Epstein, Samuel D. 1986. "The Local Binding Condition and LF Chains." Linguistic Inquiry 17: 187-205.
Epstein, Samuel D. 1990. "Differentiation and Reduction in Syntactic Theory: A Case Study." Natural Language and Linguistic Theory 8: 313-23.
Epstein, Samuel D. 1992. "Derivational Constraints on A'-Chain Formation." Linguistic Inquiry 23: 235-60.
Epstein, Samuel D. 1993. "Superiority." Harvard Working Papers in Linguistics 3, ed. H. Thrainsson, S. D. Epstein, and S. Kuno. Department of Linguistics, Harvard University, Cambridge, Mass., 14-64.
Epstein, Samuel D. 1994. "The Derivation of Syntactic Relations." Ms., Harvard University, Cambridge, Mass. Paper presented at the Harvard University Linguistics Department Forum in Synchronic Linguistic Theory, December 1994.
Epstein, Samuel D. 1995. "Un-Principled Syntax and the Derivation of Syntactic Relations." Ms., Harvard University, Cambridge, Mass.
Ferguson, K. Scott. 1993a. Generals Paper, Harvard University, Cambridge, Mass.
Ferguson, K. Scott. 1993b. "Notes on the Shortest Move Metric and Object Checking." Harvard Working Papers in Linguistics 3, ed. H. Thrainsson, S. D. Epstein, and S. Kuno. Department of Linguistics, Harvard University, Cambridge, Mass., 65-80.
Ferguson, K. Scott. 1994. "Deriving the Invisibility of PP Nodes for Command from AGR0+P0 Case Checking." In Harvard Working Papers in Linguistics 4, ed. S. D. Epstein, H. Thrainsson, and S. Kuno. Department of Linguistics, Harvard University, Cambridge, Mass., 30-36.
Ferguson, K. Scott, and Erich Groat. 1994. "Defining Shortest Move." Paper presented at GLOW 17, Vienna.
Ferguson, K. Scott, and Erich Groat. 1995. "Defining Shortest Move." Ms., Harvard University, Cambridge, Mass.
Fiengo, Robert. 1977. "On Trace Theory." Linguistic Inquiry 8: 35-61.
Fox, Danny. 1995a. "Economy and Scope." Natural Language Semantics 3: 283-341.
Fox, Danny. 1995b. "Condition C and ACD." Papers on Minimalist Syntax, MITWPL 27. Department of Linguistics and Philosophy, MIT, Cambridge.
Frampton, John. 1991. "Relativized Minimality: A Review." Linguistic Review 8: 1-46.
Freidin, Robert. 1986. "Fundamental Issues in the Theory of Binding." In Studies in the Acquisition of Anaphora, ed. Barbara Lust. Reidel, Dordrecht, 1: 151-88.
Freidin, Robert. 1992. Foundations of Generative Syntax. Current Studies in Linguistics 21. MIT Press, Cambridge.
Freidin, Robert. 1996. "Binding Theory on Minimalist Assumptions." Ms., Princeton University, N.J.
Fukui, Naoki. 1995. "The Principles-and-Parameters Approach: A Comparative Syntax of English and Japanese." In Approaches to Language Typology, ed. M. Shibatani and T. Bynon. Clarendon Press, Oxford, 327-72.
Groat, Erich. 1994. "Against Feature Deletion: A Bare Theory Argument." In Harvard Working Papers in Linguistics 4, ed. S. D. Epstein, H. Thrainsson, and S. Kuno. Department of Linguistics, Harvard University, Cambridge, Mass., 52-62.
Groat, Erich. 1995a. "English Expletives: A Minimalist Approach." Linguistic Inquiry 26: 354-64.
Groat, Erich. 1995b. "On the Redundancy of Syntactic Representations." Ms., Harvard University, Cambridge, Mass. Paper presented at GLOW 18, Tromsø.
Groat, Erich. 1997. "A Derivational Program for Syntactic Theory." Ph.D. dissertation, Harvard University, Cambridge, Mass.
Groat, Erich, and John O'Neil. 1995. "Spell-Out at the LF Interface." In Minimal Ideas: Syntactic Studies in the Minimalist Framework, ed. S. D. Epstein, H. Thrainsson, W. Abraham, and C. J.-W. Zwart. John Benjamins, Amsterdam, 113-39.
Hale, Kenneth, and Samuel J. Keyser. 1993. "On Argument Structure and the Lexical Expression of Syntactic Relations." In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, ed. Kenneth Hale and Samuel Jay Keyser. MIT Press, Cambridge, 53-110.
Hasegawa, Hiroshi. 1996. "Adjoin vs. Merge, and the Concept of C-Command." English Linguistics 13: 15-39.
Heycock, Caroline. 1995. "Asymmetries in Reconstruction." Linguistic Inquiry 26: 547-70.
Holmberg, Anders. 1986. "Word Order and Syntactic Features in the Scandinavian Languages and English." Ph.D. dissertation, University of Stockholm, Stockholm.
Hornstein, Norbert. 1994. "An Argument for Minimalism: The Case of Antecedent-Contained Deletion." Linguistic Inquiry 25: 455-80.
Huang, C.-T. James. 1982. "Logical Relations in Chinese and the Theory of Grammar." Ph.D. dissertation, MIT, Cambridge.
Jackendoff, Ray. 1972. Semantic Interpretation in Generative Grammar. MIT Press, Cambridge.
Jackendoff, Ray. 1977. X' Syntax: A Study of Phrase Structure. MIT Press, Cambridge.
Jackendoff, Ray. 1990. "On Larson's Treatment of the Double Object Construction." Linguistic Inquiry 21: 427-56.
Johnson, Kyle. 1991. "Object Positions." Natural Language and Linguistic Theory 9: 577-636.
Jonas, Dianne. 1995. "Clause Structure and Verb Syntax in Scandinavian and English." Ph.D. dissertation, Harvard University, Cambridge, Mass.
Jonas, Dianne, and Jonathan D. Bobaljik. 1993. "Specs for Subjects." In MIT Working Papers in Linguistics 18: Papers on Case and Agreement. MIT, Cambridge, 1: 59-98.
Kawashima, Ruriko, and Hisatsugu Kitahara. 1995. "Strict Cyclicity, Linear Ordering, and Derivational C-Command." In Proceedings of the Fourteenth West Coast Conference on Formal Linguistics. CSLI Publications, Stanford, Calif. (distributed by Cambridge University Press), 255-69.
Kayne, Richard. 1989. "Facets of Romance Past Participle Agreement." In Dialect Variation and the Theory of Grammar, ed. P. Beninca. Foris, Dordrecht, 85-103.
Kayne, Richard. 1994. The Antisymmetry of Syntax. MIT Press, Cambridge.
Kitahara, Hisatsugu. 1993. "Deducing Strict Cyclicity from Principles of Derivational Economy." Paper presented at GLOW 16, Lund, Sweden.
Kitahara, Hisatsugu. 1994. "Target α: A Unified Theory of Structure-Building." Ph.D. dissertation, Harvard University, Cambridge, Mass.
Kitahara, Hisatsugu. 1995. "Target α: Deducing Strict Cyclicity from Derivational Economy." Linguistic Inquiry 26: 47-77.
Kitahara, Hisatsugu. 1996a. "A Derivational Solution to Conflicting C-Command Relations." Ms., University of British Columbia, Vancouver.
Kitahara, Hisatsugu. 1996b. "Raising Quantifiers without Quantifier Raising." In Minimal Ideas: Syntactic Studies in the Minimalist Framework, ed. Werner Abraham, Samuel D. Epstein, Hoskuldur Thrainsson, and C. Jan-Wouter Zwart. John Benjamins, Amsterdam, 189-98.
Kitahara, Hisatsugu. 1997. Elementary Operations and Optimal Derivations. MIT Press, Cambridge.
Koizumi, Masatoshi. 1995. "Phrase Structure in Minimalist Syntax." Ph.D. dissertation, MIT, Cambridge.
Kuno, Susumu, and Ken-ichi Takami. 1993. Grammar and Discourse Principles: Functional Syntax and GB Theory. University of Chicago Press, Chicago.
Kural, Murat. 1997. "Postverbal Constituents in Turkish and the Linear Correspondence Axiom." Linguistic Inquiry 28: 498-521.
Larson, Richard. 1988. "On the Double Object Construction." Linguistic Inquiry 19: 335-91.
Larson, Richard. 1990. "Double Objects Revisited: Reply to Jackendoff." Linguistic Inquiry 21: 589-632.
Lasnik, Howard. 1972. "Analyses of Negation in English." Ph.D. dissertation, MIT, Cambridge.
Lasnik, Howard. 1976. "Remarks on Coreference." Linguistic Analysis 2: 1-22.
Lasnik, Howard. 1993. "Lectures on Minimalist Syntax." In MIT Occasional Papers in Linguistics 1, MITWPL. Department of Linguistics and Philosophy, MIT, Cambridge.
Lasnik, Howard. 1995. "Last Resort." In Minimalism and Linguistic Theory, ed. Shosuke Haraguchi and Michio Funaki. Hituzi Shyobo, Tokyo, 1-32.
Lasnik, Howard, and Mamoru Saito. 1984. "On the Nature of Proper Government." Linguistic Inquiry 15: 235-89.
Lasnik, Howard, and Mamoru Saito. 1992. Move α: Conditions on Its Application and Output. MIT Press, Cambridge.
Lebeaux, David. 1988. "Language Acquisition and the Form of Grammar." Ph.D. dissertation, University of Massachusetts, Amherst.
Lebeaux, David. 1991. "Relative Clauses, Licensing, and the Nature of the Derivation." In Perspectives on Phrase Structure: Heads and Licensing, ed. Susan Rothstein. Academic Press, San Diego, 209-39.
Lebeaux, David. 1995. "Where Does the Binding Theory Apply?" In University of Maryland Working Papers in Linguistics 3. Department of Linguistics, University of Maryland, College Park, 63-88.
May, Robert. 1977. "The Grammar of Quantification." Ph.D. dissertation, MIT, Cambridge.
May, Robert. 1985. Logical Form: Its Structure and Derivation. MIT Press, Cambridge.
Muysken, Peter. 1982. "Parameterizing the Notion 'Head.'" Journal of Linguistic Research 2.
Nash, L. K. 1963. The Nature of the Natural Sciences. Little, Brown, Boston/Toronto.
Nunes, Jairo. 1994. "Linearization of Non-Trivial Chains at PF." In University of Maryland Working Papers in Linguistics 2. Department of Linguistics, University of Maryland, College Park, 159-77.
Nunes, Jairo. 1995. "The Copy Theory of Movement and Linearization of Chains in the Minimalist Program." Ph.D. dissertation, University of Maryland, College Park.
Pesetsky, David. 1982. "Paths and Categories." Ph.D. dissertation, MIT, Cambridge.
Pesetsky, David. 1995. Zero Syntax. MIT Press, Cambridge.
Platzack, Christer. 1995. "Topicalization, Weak Pronouns and the Symmetrical/Asymmetrical Verb-Second Hypothesis." Ms., University of Lund, Sweden.
Pollock, Jean-Yves. 1989. "Verb Movement, Universal Grammar, and the Structure of IP." Linguistic Inquiry 20: 265-342.
Poole, Geoffrey. 1994. "Optional Movement in the Minimalist Program." In Minimal Ideas: Syntactic Studies in the Minimalist Framework, ed. S. D. Epstein, H. Thrainsson, W. Abraham, and C. J.-W. Zwart. John Benjamins, Amsterdam, 189-220.
Poole, Geoffrey. 1995. "Uninterpretability vs. Non-Convergence: A Case Study." Ms., Harvard University, Cambridge, Mass.; in review, Linguistic Inquiry.
Poole, Geoffrey. 1996. "Transformations across Components." Ph.D. dissertation, Harvard University, Cambridge, Mass.
Reinhart, Tanya. 1976. "The Syntactic Domain of Anaphora." Ph.D. dissertation, MIT, Cambridge.
Reinhart, Tanya. 1979. "The Syntactic Domain for Semantic Rules." In Formal Semantics and Pragmatics, ed. F. Guenther and S. Schmidt. Reidel, Dordrecht.
Reinhart, Tanya. 1995. "Interface Strategies." OTS Working Papers, Utrecht University.
Riemsdijk, Henk van, and Edwin Williams. 1981. "NP-Structure." Linguistic Review 1: 171-218.
Rizzi, Luigi. 1990. Relativized Minimality. MIT Press, Cambridge.
Saito, Mamoru. 1989. "Scrambling as Semantically Vacuous A'-Movement." In Alternative Conceptions of Phrase Structure, ed. M. Baltin and A. Kroch. University of Chicago Press, Chicago, 182-200.
Stanat, Donald F., and David F. McAllister. 1977. Discrete Mathematics in Computer Science. Prentice Hall, Englewood Cliffs, N.J.
Thrainsson, Hoskuldur. 1993. "On the Structure of Infinitival Complements." In Harvard Working Papers in Linguistics 3, ed. H. Thrainsson, S. D. Epstein, and S. Kuno. Department of Linguistics, Harvard University, Cambridge, Mass., 181-213.
Thrainsson, Hoskuldur. 1994. "On the (Non)-Universality of Functional Categories." In review, John Benjamins, Amsterdam.
Categories," in review, John Benjamin, Amsterdam. Toyoshima, Takashi. 1996. "Derivational CED: A Consequence of the Bottom-up Parallel Process of Merge and Attract." Ms., Cornell University. Ura, H. 1996. "Multiple Feature-Checking: A Theory of Grammatical Function Splitting." Ph.D. dissertation, MIT, Cambridge. Watanabe, Akira. 1995. "Conceptual Basis of Cyclicity." MIT Working Papers in Linguistics 27: Papers on Minimalist Syntax. Department of Linguistics and Philosophy, MIT, Cambridge, 269-91. Wyngaerd, Guido Vanden, and Jan-Wouter Zwart. 1991. "Reconstruction and Vehicle Change." In Linguistics in the Netherlands 1991: 15160. Yang, Charles D. 1996a. "Derivational Minimalism and Psychological Reality." Ms., MIT, Cambridge. Yang, Charles D. 1996b. "Minimal computation in the Minimalist Program." CUNY Sentence Processing Conference, City University of New York.