Studies in Contemporary Phrase Structure Grammar This book explores a wide variety of theoretically central issues in the framework of head-driven phrase structure grammar (HPSG), a major theory of syntactic representation which is becoming increasingly dominant, particularly in the domain of natural language computation. HPSG is a strongly lexicon-driven theory, like several others on the current scene, but unlike the others it also relies heavily on an explicit assignment of linguistic objects to membership in a hierarchically organized network of types, where constraints associated with any given type are inherited by all of its subtypes. This theoretical architecture allows HPSG considerable flexibility within the confines of a highly restrictive, mathematically explicit formalism, requiring no derivational machinery and invoking only a single level of syntactic representation. The separate chapters consider a variety of problematic phenomena in German, Japanese, and English and suggest important extensions of, and revisions to, the current picture of HPSG. Robert D. Levine is Associate Professor in the Department of Linguistics at Ohio State University and Georgia M. Green is Professor of Linguistics at the University of Illinois, Urbana-Champaign.
Studies in Contemporary Phrase Structure Grammar
edited by ROBERT D. LEVINE and GEORGIA M. GREEN
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom
The Edinburgh Building, Cambridge CB2 2RU, UK  http://www.cup.cam.ac.uk
40 West 20th Street, New York, NY 10011–4211, USA  http://www.cup.org
10 Stamford Road, Oakleigh, Melbourne 3166, Australia

© Cambridge University Press 1999

This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 1999
Printed in the United Kingdom at the University Press, Cambridge
Typeset in 10/12 pt Jansen MT

A catalogue record for this book is available from the British Library

Library of Congress cataloguing in publication data
Studies in contemporary phrase structure grammar / edited by Robert D. Levine and Georgia M. Green.
p. cm.
Includes examples in English, German, and Japanese (romanized).
Includes bibliographical references and index.
ISBN 0-521-65107-7 (hb)
1. Head-driven phrase structure grammar. I. Levine, Robert, 1947– . II. Green, Georgia M.
P158.4.S88 1999  415—dc21  98-36961 CIP

ISBN 0 521 65107 7 hardback
Contents
Introduction  Georgia M. Green and Robert D. Levine
1  The lexical integrity of Japanese causatives  Christopher D. Manning, Ivan A. Sag, and Masayo Iida
2  A syntax and semantics for purposive adjuncts in HPSG  Michael J. R. Johnston
3  On lexicalist treatments of Japanese causatives  Takao Gunji
4  “Modal flip” and partial verb phrase fronting in German  Kathryn L. Baker
5  A lexical comment on a syntactic topic  Kazuhiko Fukushima
6  Agreement and the syntax–morphology interface in HPSG  Andreas Kathol
7  Partial VP and split NP topicalization in German: an HPSG analysis  Erhard W. Hinrichs and Tsuneko Nakazawa
Index
Introduction1
Georgia M. Green, University of Illinois
Robert D. Levine, Ohio State University
1 Overview
Head-driven phrase structure grammar (HPSG) has evolved as a synthesis of ideas from a number of theoretical sources, including generalized phrase structure grammar (GPSG), categorial grammar, and formal theories of data structure representation (e.g., the PATR-II formalism and Kasper-Rounds logics of computation). In the course of more than a decade of development these theoretical resources have been applied to a monostratal theory of linguistic structure which is capable of providing a formally explicit grammar for any given natural language. HPSG uses a fundamental theoretical strategy made familiar by GPSG: the enumeration of a class of objects, corresponding to expressions of some natural language, and a set of constraints whose interaction enforces the appropriate covariation of formal properties reflecting the dependencies that any grammar of that language must capture. A head-driven phrase structure grammar of some language defines the set of signs (form/meaning correspondences) which that language comprises. The formal entities that model signs in HPSG are complex objects called feature structures, whose form is limited by a set of constraints – some universal and some language-parochial. The interaction of these constraints defines the grammatical structure of each such sign and the morphosyntactic dependencies which hold between its subcomponents. Given a specific set of such constraints, and a lexicon providing at least one feature structure description for each word in the language, an infinite number of signs is recursively characterized.
1 Green’s work was supported in part by the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign.
It is useful to distinguish three phases in the evolution of what may be thought of as a “classical” theory of HPSG: the version presented in Pollard and Sag (1987), informally referred to as HPSG-I; chapters 1–8 of Pollard and Sag (1994), defining HPSG-II; and chapter 9 of Pollard and Sag (1994), which contains enough revisions of the theory offered in the preceding part of the volume to constitute a separate version of the theory, HPSG-III. As HPSG has been adopted by an increasing number of investigators, numerous innovations and emendations have been grafted onto this basic theory, many of which are proposed, illustrated or defended in the contributions in this volume. At present, the theory is in an intense and fertile period of development which precludes the possibility of a straightforward unitary treatment, but the following chapters afford a useful point of departure for those who wish to acquaint themselves with current thinking in this framework. To assist readers in this pursuit, we offer in section 2 below a basic introduction to some of the leading concepts of the classical theory, and in section 3 outline the enrichments, extensions, and revisions to this theory contributed by each of the papers in this collection.
2 Fundamentals of HPSG
2.1 Feature structures, signs, and types
2.1.1 Feature structures and feature structure descriptions
All linguistic objects (including both expression types, and the abstract objects that are invoked to describe them) are modeled in HPSG as feature structures.2 A feature structure is a complete specification of all the properties of the object it models. Feature specifications consist of a value of the appropriate kind, or type, for each required attribute or feature. In other words, all attributes of the object being modeled must be specified. Feature structures themselves are represented as directed graphs, not necessarily acyclic, subject to certain formal restrictions.3 A schematic example of a feature structure is given in (1):
2 For discussion see Pollard and Sag (1994: 8, 17–18), and for background, Shieber (1986), Pollard and Moshier (1990), Carpenter (1992).
3 For example, they must be totally well-typed, which means that they are complete models rather than partial models (i.e., constraints or descriptions) of the objects they represent. Feature structures must also be sort-resolved, which is to say that all values must be the maximal (most specific) ones possible; i.e., every node q in the feature structure must be labeled by a sort with no subsorts, and every path must terminate in an atomic sort, one with no attribute declarations.
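To make the notions of appropriateness, total well-typedness, and sort-resolution concrete, here is a minimal illustrative sketch in Python. It is not part of the HPSG literature; the sort names, the dictionaries, and the helper functions are invented for exposition, and only the completeness checks described above are modeled.

```python
# Hypothetical sort declarations: each sort lists its appropriate
# attributes and the sort its value must belong to.
SORT_DECLS = {
    "index": {"PER": "per", "NUM": "num", "GEND": "gend"},
    "per":  {},          # atomic sorts declare no attributes
    "num":  {},
    "gend": {},
}

# Subsort relations (maximal sorts are those with no subsorts).
SUBSORTS = {
    "per":  {"1st", "2nd", "3rd"},
    "num":  {"sg", "pl"},
    "gend": {"masc", "fem", "neut"},
}

def is_maximal(sort):
    """A sort is maximal if it has no declared subsorts."""
    return not SUBSORTS.get(sort)

def totally_well_typed(sort, fs):
    """Check that fs supplies every attribute declared for sort,
    and that every value is a maximal (sort-resolved) sort."""
    declared = SORT_DECLS[sort]
    if set(fs) != set(declared):
        return False
    return all(value in SUBSORTS[declared[attr]] and is_maximal(value)
               for attr, value in fs.items())

# The index of "she": all three attributes present, all values maximal.
she_index = {"PER": "3rd", "NUM": "sg", "GEND": "fem"}
print(totally_well_typed("index", she_index))   # True

# A mapping that omits GEND is only a partial description, not a
# feature structure in the technical sense.
print(totally_well_typed("index", {"PER": "3rd", "NUM": "sg"}))  # False
```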
(1) [Directed-graph feature structure, radically simplified: a sign of sort phrase whose DTRS value, of sort headed-struc, has a HEAD-DTR of sort word with COMP-DTRS nil; the head daughter’s CONTENT is of sort npro, and its INDEX (an index specified PER 3rd, NUM sg, GEND fem) is the very same node as the INDEX inside the mother phrase’s own npro CONTENT, the convergence of arcs indicating structure-sharing.]
This (radically oversimplified) graph, reflecting a typical HPSG-II representation, is, very roughly speaking, a (partial) representation of the nonbranching, headed phrase structure (2):
(2)
    XPi
     |
     Xi
The feature structure in (1) reflects the following information: the sign in question is of subtype phrase, whose head daughter is of subtype word, i.e., a lexical sign, specified for an INDEX feature as part of its semantic attributes, indicated by the feature CONTENT; furthermore, as the convergence of the arrows indicates, the INDEX of the head daughter is explicitly identified as being the same thing as the index of the mother phrase itself. The graph representation of feature structures is awkward to display and tedious to interpret, so, as a convenience, feature structure descriptions in
the form of attribute value matrices (AVMs) are commonly used instead. Attribute or feature names are typically written in upper case in AVMs (which represent feature structure descriptions), and values are written to the right of the feature name, in lower case, as in (3).
(3)  [ PER   3rd
       NUM   sg
       GEND  fem ]
Types of feature structures inherit all of the attributes and type restrictions on their values from all of their supertypes.4,5 Feature structures are the entities constrained by the grammar. It is crucially important to distinguish between feature structures (fully specified objects) and feature structure descriptions, objects that (partially) describe feature structures. Feature structure descriptions characterize classes of objects. For example, the NP she could be represented by a fully specified feature structure (representable as a directed graph), but “NP” is (an abbreviation for) a feature structure description, and, under the restrictions described in note 3, could not be so represented. Put another way, a partial description is a constraint on members of a class of feature structures, while a total description is a constraint which limits the class to a single member. For the most part, grammar specification deals with generalizations over classes of words and phrases, and therefore with (partial) feature structure descriptions.

2.1.2 Signs and their attributes
As already noted, the primary object of linguistic analysis in HPSG is the sign, which models the association of form and meaning. Signs belong to one of two types: word and phrase. An act of uttering a linguistic expression corresponding to a particular sign is an act of producing a noise that corresponds to the phonological properties of that sign, with the intent that the product of that act be understood as intended to have syntactic, semantic, and contextual properties corresponding to the respective attributes of that sign. The sign itself is an abstract structured object with phonological, syntacticosemantic, and contextual attributes, expressing different kinds of properties of the sign.
4 In previous formulations of HPSG, this inheritance is strictly monotonic; adding information must never entail the revision of specifications. In recent work by Sag (1997) this requirement is relaxed.
5 The set of feature-structure types in a grammar is a partial subsumption ordering, i.e., the subsumption relation is transitive, reflexive, and antisymmetric. Thus, the division of the broad class of signs into words and phrases noted above represents the fact that the type sign subsumes both phrase and word. In fact, since the specifications for phrase and word are mutually exclusive (phrases have attributes which specify their immediate constituents, and words don’t), the types phrase and word partition the type sign.
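The contrast drawn above between a total description, which pins down a single feature structure, and a partial description, which characterizes a class of them, can be illustrated with a small continuation of the same invented Python representation; the satisfies function is an expository assumption, not a piece of any published formalization.

```python
# A description is a (possibly partial) mapping from attributes to values;
# a feature structure satisfies it if it agrees on every attribute the
# description mentions.
def satisfies(feature_structure, description):
    return all(feature_structure.get(attr) == value
               for attr, value in description.items())

she_index  = {"PER": "3rd", "NUM": "sg", "GEND": "fem"}
they_index = {"PER": "3rd", "NUM": "pl", "GEND": "fem"}

partial_description = {"PER": "3rd"}        # picks out a class of indices
total_description   = dict(she_index)       # pins down a single index

print(satisfies(she_index, partial_description))    # True
print(satisfies(they_index, partial_description))   # True: still a class
print(satisfies(they_index, total_description))     # False: NUM conflicts
```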
The feature system employed in HPSG echoes GPSG feature theory in a number of respects, but there are some significant differences. One particularly evident difference between the two frameworks is that HPSG’s feature geometry is considerably more ramified. In HPSG, all signs are assumed to have PHON and SYNSEM attributes, recording their phonological and syntacticosemantic structures, respectively.6 The value of the SYNSEM attribute is a feature structure which represents the constellation of properties that can be subcategorized for. It has a LOCAL attribute, whose value has CATEGORY, CONTENT, and CONTEXT attributes, and represents what is shared by filler and gap in so-called extraction constructions. It also has a NONLOCAL attribute, whose values constrain all types of unbounded dependency constructions (UDCs). The CATEGORY attribute takes as its value a category, whose HEAD attribute has as its value a part of speech and whose valence attributes SUBJ, SPR, and COMPS each have a list of synsems as their value-type. An ARG-ST (argument structure) feature is a property of all words. Its value is a list of synsems denoting the sign’s arguments, ordered by obliqueness, and it contains the obliqueness record that is invoked in constraining binding relations (cf. Pollard and Sag 1994: ch. 6). The valence attributes take over the saturation-tracking function of the HPSG-II SUBCAT feature. The value of the CONTENT feature is (depending on which part of speech the HEAD value is) a nominal object, a quantifier or a parameterized state-of-affairs (a psoa). A psoa is (roughly speaking) a representation of a (possibly open) proposition. Psoas form an elaborate type subhierarchy, with kinds of relations naming subtypes and determining what argument-denoting attributes they have, as illustrated in (4).7
(4) [Fragment of the psoa subhierarchy: the relation nominate has two index-valued argument attributes, while persuade has two index-valued argument attributes together with a psoa-valued SOA-ARG attribute.]
A nominal object, by contrast, corresponds to the logical representation of a common noun (although it is useful, in certain cases, to take common nouns to have semantic values representable as psoas, particularly when the noun denotes an event rather than an individual, as discussed in Michael 6
6 PHON values are usually represented in standard orthography, solely for the sake of convenience and readability.
7 The representation of propositional content as psoas does not reflect an essential property of HPSG. It would make no difference if some other kind of coherent representation of a semantic analysis were substituted, as long as it provided a way of indicating what properties can be predicated of which arguments, how arguments are linked to individuals in a model, and how the meaning of each constituent is a function of the meaning of its parts. In other words, the exact form of the representation is not crucial as long as it provides a compositional semantics.
Johnston’s paper in this volume). It is expressed in a feature structure representation containing an INDEX attribute, with values of type index, and a RESTRICTION attribute, whose value is a set of psoas. Indexes in turn have attributes for PERSON, NUMBER, and GENDER. For perspicuity, in abbreviated AVMs, index values are often represented as subscripts on category designations: NPi, for example, or NPthere. A psoa-valued CONTENT specification is similarly abbreviated following a colon after a category designation; VP:n represents a VP with the content n. Finally, the CONTEXT attribute records indexical information (in the values of the SPEAKER, ADDRESSEE, and UTTERANCE-LOCATION features) and presuppositions in the psoa-set value of the BACKGROUND attribute. Linguistically relevant information that is generally considered pragmatic is supposed to be represented in the value of the CONTEXT attribute. For some discussion, see Green (1995).

2.1.3 Types
The requirements of well-typedness and sort-resolution (see note 3) entail that grammars must be complete and explicit about what kinds of features are required to properly characterize an object of any given type, and also what kind of objects (i.e., what types) are appropriate values for any given feature. The CONTENT attribute of a sign, for example, can have, among other possible types, an object of type nonpronominal (or npro) as its value, and npro-type objects require, inter alia, a specification for an INDEX value. Thus, a sort is defined by a declaration of the attributes (features) it has, and the value-types of those features. Feature declarations are represented as labeled attribute-value matrices, AVMs, as illustrated in (5), where Fi are feature names and sorti are sort names.
(5)  sort0
     [ F1  sort1
       F2  sort2
       ...
       Fn  sortn ]
Sort declarations specify what attributes an instance of the sort has, and what kinds of things the values of those attributes can be, and sometimes what particular value an attribute must have (either absolutely, or relative to the value of some other attribute). For two sorts, a and b, a is a subsort of b iff it is dominated by b in a hierarchical classification of sorts generally referred to as the sort hierarchy; sorts which label terminal nodes in the sort hierarchy are termed “maximal sorts” because they are maximally informative or specific. Constraints on feature structures are expressed in terms of feature-structure descriptions, and can therefore take full advantage of underspecification and subsumption relations. What is implicit in sort definitions (including lexical specifications) or in universal or language-specific constraints does not have to be expressed in the representations of linguistic objects. For
example, since the Head Feature Principle requires that the HEAD value of the head daughter of a phrase be the same as the HEAD value of the phrase itself, the details of this value only need to be indicated once in each representation of a phrase. The notion of the values of two attributes being the very same object is modeled in feature structures as the sharing of structure, as illustrated above in (1). In referring to token-identity, and not just type-matching, structure-sharing is a crucial property of HPSG which does not have a direct counterpart in other syntactic theories. It amounts to the claim that the value of some instance of an attribute is token-identical to the value of some other instance of an attribute, i.e., it is the same thing – not a different thing which happens to have all the same properties. As indicated in (1) above, structure-sharing is represented in feature structures as a convergence of arrows (sometimes referred to as re-entrancy), whereas in AVMs this kind of token-identity is shown via recurrence of tags – boxed integers like [1]. Thus, the following three AVMs are equivalent descriptions of the feature structure in (1):8
(6) a. [ SYNSEM | CONT | INDEX                     [1] [ PER 3rd, NUM sg, GEND fem ]
         DTRS | HEAD-DTR | SYNSEM | CONT | INDEX  [1]
         DTRS | COMP-DTRS                          nil ]

    b. [ SYNSEM | CONT | INDEX                     [1]
         DTRS | HEAD-DTR | SYNSEM | CONT | INDEX  [1] [ PER 3rd, NUM sg, GEND fem ]
         DTRS | COMP-DTRS                          nil ]

    c. [ SYNSEM | CONT | INDEX                     [1] [ NUM sg ]
         DTRS | HEAD-DTR | SYNSEM | CONT | INDEX  [1] [ PER 3rd, GEND fem ]
         DTRS | COMP-DTRS                          nil ]
All three descriptions convey the same information, since there is only one way to satisfy the token-identities in the three descriptions.9 As noted above, types, or sorts, are organized hierarchically in the logic of HPSG. For each local subtree in the type hierarchy, the sorts which label the daughters partition the sort which labels the mother; that is, they are
8 In the following AVMs we employ the conventional representation of feature-name pathways in which [A [B [C x]]] is represented as A|B|C x. For reasons of perspicuity, sometimes values are labeled with the name (in italics) of the sort that structures their content, but such information is usually omitted wherever possible.
9 Note that in certain recent work DTRS is not employed; rather, HEAD-DTR and NON-HEAD-DTRS are “top-level” attributes of phrasal signs.
necessarily disjoint subsorts which exhaust the sort of the mother. For example, subsorts of head can be any of a number of part-of-speech types, some of which are themselves further partitioned, as illustrated in (7), following Sag (1997).
(7)  head
       noun
       verbal
         verb
         complementizer
           that
           for
       adjective
       preposition
       determiner
       ...
Because a type can have more than one super-type, the theory allows for multiple inheritance, which enables types to be cross-classified. For example, partitions of the sorts constituent-structure (i.e., head-complement, head-adjunct, head-filler, . . . ) and clause-type (i.e., declarative, interrogative, relative, . . . ) and their subsorts cross-classify clausal structures so that in Sag (1997), for example, a subject-relative clause like who loves Sandy is both a head-complement structure and a type of relative clause, and an unprefixed relative clause like Sandy admires is also a head-complement structure, and a different type of relative clause, while who Sandy admires is also a relative clause, but a head-filler phrase, rather than a head-complement phrase. In the inheritance hierarchy in (8), words and phrases, as subsorts of sign, have PHON values which are lists of phonstrings, and SYNSEM values which are synsems, and headed structures have a head daughter, while coordinate structures have a list of coordinate phrases as daughters.
(8) [Inheritance hierarchy: sign, with a PHON value of type list(phonstrings) and a SYNSEM value of type synsem, is partitioned into word and phrase; phrase is in turn partitioned into headed-phr, with a head-daughter attribute whose value is a singleton list of signs, and coord-phr, with a daughters attribute whose value is a list of phrases.]
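Multiple inheritance of the kind just described has a rough analogue in ordinary class inheritance. The sketch below is only an analogy under invented names (it does not formalize (8) or Sag 1997); it shows how a single object can inherit constraints from a constituent-structure dimension and a clause-type dimension at the same time.

```python
# Two cross-cutting dimensions of classification, as in the text:
# constituent-structure types and clause types.
class HeadedPhrase:                 # constituent-structure dimension
    has_head_daughter = True

class HeadComplementPhrase(HeadedPhrase):
    pass

class HeadFillerPhrase(HeadedPhrase):
    pass

class RelativeClause:               # clause-type dimension
    clause_type = "relative"

# Cross-classified maximal types inherit from one type in each dimension.
class SubjectRelativeClause(HeadComplementPhrase, RelativeClause):
    """e.g. 'who loves Sandy'"""

class WhRelativeClause(HeadFillerPhrase, RelativeClause):
    """e.g. 'who Sandy admires'"""

srel = SubjectRelativeClause()
print(srel.has_head_daughter, srel.clause_type)   # True relative
print(isinstance(srel, RelativeClause))           # True: constraints on
                                                  # relative clauses apply
```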
In further partitions, the head daughter (HEAD-DTR) is additionally specified as a word or a phrase, and other kinds of daughter attributes may be specified. The values for the various daughter sorts are list-valued so that they can be specified as empty. Although the theory of features in HPSG owes much to the work on GPSG cited earlier, HPSG admits a larger set of value types for features. In HPSG, a feature’s value belongs to one of four possible types:
• atom
• feature structure
• set of feature structures10
• list of feature structures11
If a value is not specified in a feature-structure description (i.e., an AVM), the value is still constrained by the type-definitions to be one of the possible values for that feature. That is, underspecification or nonspecification of an attribute amounts to specifying a disjunction of values allowed by the degree of underspecification. Thus, specifying either NP[ ] or NP amounts to specifying NP[NUM sg ∨ pl], and so on, for all the possible attributes of NPs (i.e., all the features they can have).
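Underspecification as an implicit disjunction over the appropriate values can be pictured by extending the toy representation used earlier; again, the encoding is an invented simplification.

```python
# A description is satisfied by any structure that agrees with it on
# every attribute it mentions (cf. the sketch above).
def satisfies(fs, description):
    return all(fs.get(a) == v for a, v in description.items())

NUM_VALUES = {"sg", "pl"}            # the maximal values appropriate for NUM

underspecified = {"PER": "3rd"}      # NUM left unspecified
disjuncts = [dict(underspecified, NUM=v) for v in NUM_VALUES]

she = {"PER": "3rd", "NUM": "sg", "GEND": "fem"}
print(satisfies(she, underspecified))             # True
print(any(satisfies(she, d) for d in disjuncts))  # True: the same result
```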
2.2 Constraints and structure-sharing
An HPSG grammar consists of a set of constraints on the form of signs consistent with the constraints on the values of the features that are defined for them. Just as in GPSG, the range of possible phrase structures can be taken to be defined in terms of constraints on the well-formedness of the class of objects admitted by a particular grammar; the various phrase structure schemata that an HPSG for some language admits are in effect just very general restrictions on the combinatoric possibilities of linguistic objects. Unlike GPSG, and the theories of phrase structure which preceded it, however, HPSG does not treat constituent-structure trees as formal objects, although they remain a convenient graphic representation of the immediate constituents and linear order properties of phrasal signs. Instead, constituent structure is represented by the various attributes of phrasal signs. In informal representations, nodes are labeled by analyzable category names displayed as AVMs, and linear order is imposed.12 Beyond the constraints implicit in the phrase structure possibilities of the grammar, there is a variety of further restrictions. Some constraints on possible signs are inherent in the hierarchy of sorts and the inheritance it defines. A handful of others depend on the notion of structure-sharing, explained in section 2.1.3 on types, to constrain feature-value correspondences between sisters, or between mother and some daughter, for particular features. These include
11
12
Set values are represented as sequences within curly brackets: {1, 2}. The empty set is denoted: { }, while {[ ]} denotes a singleton set. List values are represented as sequences within angled brackets: 〈1 [nom], 2[inf ]〉. The empty list is denoted: 〈 〉, while 〈[ ]〉 denotes a singleton list. The various constituent-structure types defined by the handful of ID-Schemata that in Pollard and Sag (1994) constituted a “disjunctively specified principle of universal grammar” (1994: 38) may be considered as just a further elaboration of the type hierarchy for the type phrase.
. .
10
familiar principles like the H-Feature Principle (which constrains the value of a phrase to be the same as the value of its head daughter) and the Valence Principle, as well as some form of N Feature Principle which governs the projection of the unbounded dependency features (, , and ).13 Principles which constrain the value of a phrase to be a particular function of the values of its daughters, depending on what subtype of phrase it is, are specified in the sort declarations for particular subsorts of phrase. The Valence Principle (a reformulation of the Subcategorization Principle of Pollard and Sag 1994: chs. 1– 8) constrains objects of the sort headed-phrase so that the value of each feature corresponds to the respective valence value of their head daughter, minus elements that correspond to values of -, -, and -. In effect, the Valence Principle says that the , and values of a phrase correspond to the respective , and values of its head daughter except that the values specified for this daughter that correspond to any -, -, and - values respectively are absent from the respective valence attributes of the phrase itself. The H-Feature Principle, described above, is likewise represented in (9) as a constraint on headed phrases. (9)
headed-phrase ||| 1 |- [||| 1]
Just as heads select arguments by valence features, adjuncts select heads via a feature , and determiners select heads via a feature . The use of the structure-sharing notation to express generalizations can be seen by examining a few particular cases. For example, the valence of the raising verb tend is represented as in (10). (10)
1 VP
inf 〈1〉
This constraint says that tend needs as a subject whatever its VP complement needs as its subject. It specifies tend’s value as identical to the value of the VP which tend selects as its complement. Similarly, (11) represents a description of the valence of a raising verb in a structure where it happens to have a quirky-case infinitive complement, as in, for example, Icelandic. 13
Recent work suggests that each of these features may require its own respective principle regulating its propagation throughout a sign. See Sag (1997).
Introduction
11
(11) 〈1 NP〉 VP inf 〈1[ gen]〉 The structure-shared values entail that the the subject of the raising verb must have whatever case the subject selected by the VP complement has. Finally, the structure-sharing required by the description of a topicalization structure, given in (12), requires the | value of the filler daughter to be the same as the single member of the value of the head daughter. (12)
phrase
verb | 〈〉 〈 〉 -|{1} -| 1 |{ }
-
2.3
Constituent structure
Information about the constituent structure of phrases (as well as information about the relation of the constituent parts to each other) is recorded in the phrase-type, or in the various (or ) attributes (-, -, -, -, -, - (-)). Apart from - these features are all list-valued, enabling them to be present but empty, though some are limited to being no longer than singleton lists. Thus, a description like (13) indicates a tree with three daughters: a verb head daughter, and two complement daughters (an NP and a PP). (13)
phrase give a book to Sandy verb synsem noun 〈〉 || 〈 〉 〈 〉 - 〈 〉 - 〈 NP, PP〉 - 〈 〉
Alternatively, the same information could be represented in a more elaborated type hierarchy, in the declaration for a type head-comps-ph, as in (14).
. .
12
(14)
head-comps-ph give a book to Sandy verb synsem noun || 〈〉 〈 〉 〈 〉 -- 〈NP, PP 〉
Sometimes linguists find it convenient to translate these descriptions into annotated tree diagrams where nodes are labeled by category descriptions in the form of AVMs for signs (or abbreviations for them), and arcs are labeled with relations as in (15) or (16). (15)
phrase gave a book to Sandy 2 verb 1 H word
gave 2 1 〈3, 4〉
(16)
phrase
a book noun 3 〈 〉
C phrase
to Sandy prep 4 〈 〉
VP H
2.4
C
C
C
V
NP
PP
gave
a book
to Sandy
Constituent order
The general outlines of the HPSG approach to constituent order derive from the theory of linear precedence rules sketched in GPSG (Gazdar et al. 1985, Pullum 1982), and discussed at some length in Pollard and Sag (1987). It was envisioned that linear precedence constraints would be constraints on the values of phrases with content along the following lines:
13
• A lexical head precedes all of its sisters.
• Fillers precede phrasal heads.
• Less oblique complements not headed by V precede more oblique phrasal complements.
As serious grammar development for a number of languages (especially notably, German and French) has made abundantly clear, word order constraints are not always compatible with the semantic and syntactic evidence for constituency, and the exact form of the resolution to this dilemma constitutes a lively topic in current research. Dowty (1996), Kathol (1995), and Reape (1996) on the one hand, and Nakazawa and Hinrichs (this volume) on the other, represent two (not necessarily incompatible) approaches that continue to be explored. As in GPSG, so-called “free word order” (i.e., free phrase order) is a consequence of not constraining the order of constituents at all. (Genuinely free word order, where (any) words of one phrase can precede (any) words of any other phrase requires a word-order function that allows constituents of one phrase to be interleaved with constituents of another (Pullum 1982, Pollard and Sag 1987, Dowty 1996)).
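One way to picture how linear precedence statements of this kind interact with otherwise unordered constituency is sketched below; the encoding of daughters and the lp_ok function are invented for illustration and are not drawn from the GPSG or HPSG literature.

```python
from itertools import permutations

# Daughters of a hypothetical head-complement phrase, with just enough
# annotation for the LP statement illustrated here.
daughters = [
    {"label": "V",  "lexical_head": True},
    {"label": "NP", "lexical_head": False},
    {"label": "PP", "lexical_head": False},
]

def lp_ok(order):
    """A lexical head precedes all of its sisters (one of the LP
    statements above); nothing else is constrained here."""
    head_pos = next(i for i, d in enumerate(order) if d["lexical_head"])
    return head_pos == 0

# The admissible orders are just the permutations that satisfy the
# LP constraints; leaving constituents unconstrained yields free order.
admissible = [tuple(d["label"] for d in p)
              for p in permutations(daughters) if lp_ok(p)]
print(admissible)   # [('V', 'NP', 'PP'), ('V', 'PP', 'NP')]
```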
2.5
Equi, Raising and expletive pronouns
As in GPSG, infinitive complements are treated as projections of verbal heads, and infinitival to is widely treated as a Raising verb, along the lines of e.g. tend.14 Equi and Raising structures are both projections of heads which subcategorize for an unsaturated predicative complement, and indeed, have the same constituent structure – either (17a) or (17b), depending on the verb. (17) a.
S NP
VP V S
b. NP
VP V
14
VP
NP
VP
Sag (1997) offers a slightly different analysis of to which, however, converges with the auxiliary verb account for purposes of the following discussion.
14
. .
Pre-theoretically, the difference between Raising verbs and Equi verbs is that Raising verbs have an argument to which they don’t assign a semantic role, while Equi verbs assign roles to all their arguments.15 In Pollard and Sag (1994) this difference is represented by specifying that Equi verbs subcategorize for an NP with an index (of type ref, i.e., not an expletive) which is the same as the index of the specification of its complement, and assigns a semantic role to the index of the coindexed NP, as indicated in (18a), while a Raising verb takes as its subject the token-identical feature structure that its complement VP selects as subject, but assigns no semantic role to the index of that element, as indicated in (18b). (18) a. intr-equi-verb 〈NP1 〉 〈VP [inf, 〈NP1 〉]:2〉 try 1ref - 2 b. intr-raising-verb 〈1〉 〈VP [inf, 〈1〉]:2〉 tend - 2 The absence of a role assignment for one subcategorized element for Raising verbs entails that the content of that argument has no semantic relation in the Raising verb’s clause. Assignment of a role to the index of an equi verb’s subject entails that sentences with passive Equi complements will have different truth-conditional semantics from ones with active complements. By the same logic, sentences with active and passive Raising complements will have the same truth-conditional semantics. The restriction that the Equi controller have an index of type ref follows from the assignment of a semantic role (because roles in psoas have to be of type ref or type psoa (Pollard and Sag 1994: 397)). This precludes the possibility of expletive Equi controllers. Structure-sharing between the valence values in Raising constructions predicts the possibility of “quirky” case on Raising controllers (as illustrated in (11) above), and the restriction of PP Raising controllers to those which can plausibly be identified as referential.16 15
16
The position that subcategorized constituents are just those to which semantic roles are assigned by heads, instantiated as the Theta Criterion in Chomsky (1981) and much subsequent work, has been strongly challenged on empirical grounds in Postal and Pullum (1988), and at least some recent work in the Government-Binding (or “principles and parameters”) framework has dropped this assumption about the relationship between syntactic form and semantic interpretation, as, for example, in recent work by John Moore (1996). Thus, PPs which denote locations can appear in Raising constructions like Under the bed seemed to be the best place to store the beer.
Semantic roles are assigned only to psoas and indexes of type ref. Consequently, roles are never assigned to expletives, and role-assigned arguments are never expletives.17 But some verbs do subcategorize for expletive subjects, for example: • • •
• “weather” expressions (it): rain, late, Tuesday . . .
• existential verbs (there): be, arise, occur, . . .
• extraposition verbs and adjectives (it): seem, bother, obvious . . .
In fact, Postal and Pullum (1988) have shown that some predicative expressions subcategorize for expletive objects. For example, transitive idioms like wing, go at, out of . . . require an expletive object, as do extraposition predicates like resent, take, depend upon . . . , which require a sentential object in addition. The HPSG analysis is that the expletive it has a [ 3, sg] index of type it, the expletive there has a [ 3] index of sort there, and both index sorts are subsorts of the sort index, along with the subsort ref. The appearance in there-constructions of agreement between the verb and its first object follows from specifying that the verb subcategorizes for a direct object whose value is shared with that of its there subject. Agreement is, as usual, a linking of some form of the verb with the value of the index of the subject it subcategorizes for. The Subcategorization Principle interacts with raising structures to allow the subcategorization requirements of verbs recursively embedded in raising structures to be satisfied by an indefinitely higher subject.
2.6
Unbounded dependencies
2.6.1 Extractions The general outlines of the HPSG treatment of unbounded extractions follow the three-part strategy developed in GPSG (Gazdar 1981, Gazdar et al. 1985): 1. 2. 3.
licensing an extra constituent, just in case it matches a missing constituent (represented as the value of the extraction-recording feature ) ensuring that the missing constituent will indeed be missing recording the correspondence between the extra constituent and the constituent that is missing by means of local (mother–daughter) correspondences over an indefinitely large array of structure
HPSG incorporates a distinction, defended in several studies (e.g., Hukari and Levine 1987, 1991, Cinque 1990) between strong and weak extractions. In strong extraction constructions, the extra constituent has all the categorial properties expected on the missing constituent; thus, the head daughter’s 17
These facts interact with the Raising Principle requiring subcategorized elements that are not required to be an expletive to receive a semantic role (unless a nonsubject which subcategorizes for that element is also subcategorized for by the same term (Pollard and Sag 1994: 140–142)). This principle makes several correct predictions about raising structures.
16
. .
values for share structure with the value of a nonargument filler daughter. In weak extraction phenomena, a constituent that is the argument of some element must be coreferential with the missing constituent; thus, only coindexing is required (not full categorial identity) between some constituent and the value of on another constituent.18 The operative difference is that in strong extraction, the value of the two elements must match; in weak ones it need not (Pollard and Sag 1994: 187). That is, is specified on arguments independently of case specified for the missing constituents, as in phrases like those in (19): (19) a. b. c. d.
John is easy to please–. (tough-complements) I am available to dance with–. (purpose infinitives) I gave it to the man Dana thinks– is French. (that-less relative clauses) It’s me who Dana says– is ferocious.19 (it-clefts)
In HPSG, the extra constituent in strong extractions is licensed by a schema (or, alternatively, a sort declaration) which defines head-filler clauses. In weak extraction cases such as tough-constructions, a head of the relevant class selects a complement with a nonnull specification; this entails that some descendent of this complement will not be lexically realized.20 The nature of this nonrealization is still a matter of some dispute. Missing constituents were treated in GPSG and HPSG-II as traces, i.e., lexically defined phrasal constituents of various category types which in each case were missing a constituent of exactly that type; thus an NP-trace is NP[ NP], a PP-trace is PP[ PP], a PP[to]-trace is PP[to, [ PP[to]], a 3rdsingular-NP-trace is NP[ sg, 3, [ NP[ sg]], and so on. Traces were licensed in phrases by rules which define, for each lexical element that subcategorizes for one or more phrasal sisters, a corresponding item with a phrasal trace sister. (In GPSG this is accomplished by a , a rule which defines rules on the basis of other rules. In HPSG-III it is accomplished by a lexical rule which defines lexical entries which lack certain elements on valence lists, and have corresponding elements in their sets.) We discuss more recent alternative proposals for empty categories in section 2.6.3. Missing subjects, which are not sisters of lexical heads, have been licensed in both GPSG and HPSG by a lexical rule which lets a lexical head that would ordinarily subcategorize for an S instead subcategorize for a VP (i.e., an S that lacks a subject), just in case its mother is missing an NP. This treatment has what was at one time regarded as the happy consequence of entailing the 18
19 20
Weak extractions correspond to empty operator movements in the GB literature on extraction phenomena. Note that here there is no correspondence of values for either or . There are unresolved problems involving the interpretation of quantifiers in some weak types of extraction. For example, the feature geometry of HPSG on Pollard and Sag’s (1994) account fails to predict the ambiguity of A good man is hard to find and predicts a widescope interpretation for a good man.
Introduction
17
familiar that-trace facts (Gazdar et al. 1985: 57–162, Pollard and Sag 1994: 384). However, a number of facts have more recently been seen to converge in favor of treating subject extraction as simply another instance of the same filler–gap relation as is seen in complement extraction.21 For example, the that-trace effect has been shown to vanish when material bearing a phrasal stress (such as an adverbial phrase) intervenes between the complementizer and the site of the missing subject (a point noted in passing in Bresnan 1977 and more recently rediscovered and investigated by Culicover, e.g., 1993).22 If this is the correct interpretation of such facts, then the theory of UDCs in HPSG takes on a less heterogeneous appearance. The correspondence between the extra constituent and the missing constituent (precluding the possibility of sentences with an extra NP and a missing PP) is guaranteed by a universal constraint on the occurrence of unbounded dependency features. Such features, called “instantiated foot features” in GPSG, and features in HPSG, are constrained to appear on a phrase if they are present with the same value on at least one daughter, and on a daughter constituent only if they are present with the same value on the mother. Thus, the match between the extra constituent and the missing constituent is, as Gazdar et al. elegantly put it, “a global consequence of a linked series of local mother-daughter feature correspondences” (Gazdar et al. 1985: 138). Chapter 9 of Pollard and Sag (1994) develops HPSG-III with some interconnected revisions, as suggested by Borsley (1987, 1989, 1990), to the accounts of subcategorization and extraction. First of all, adding the list-valued , , and features allows subject arguments to be distinguished from complement arguments, and determiners from subjects.23 Obliqueness continues to be recorded on the feature (now called -), as needed for the account of binding and for principles of constituent order, but saturation is now recorded in the three valence features , , and according to the Valence Principle, which in effect distributes the Subcategorization 21 22
23
Cf. Bouma et al. (1998). Similarly, Hukari and Levine (1998) argue that certain examples cited in Haegeman (1984), of the form Dana is someone who until you get to know can seem quite strange, must be interpreted as parasitic gaps and thus require the theory to license expressions of the form wh[S ([XP . . .– i) – i VP], indicating the need to treat the subject gap on a par with argument gaps. Borsley offered three sorts of reasons. First, he observed, treating complements as a distinct subset of argument phrases makes it possible to characterize all phrases as [ 〈 〉]. Second, the separation of subjects from other arguments obviates an unfortunate artifact of treating subjects as just initial valence elements, and correctly allows objects of prepositions like on in depend on Sandy to be treated as complements rather than as subjects. Finally, he argues that the parallelism between Welsh finite-verb inflection and prepositional objects, which both treat pronoun subjects and objects differently from nonpronominal ones, with supporting facts from clitic doubling, show that in Welsh, pronoun subjects are complements. This generalization couldn’t be expressed if subjects were distinguished from objects only by obliqueness order. Pollard and Sag suggest (1994: 357) that this sort of account might be appropriate for flat-structure languages generally, unless subjects have to be distinguished from nonsubjects for syntactic reasons, and by means other than relative obliqueness.
18
. .
Principle over the three valence features. This allows the - feature to record what amounts to argument structure, or f-structure, regardless of which arguments are actually instantiated as syntactic constituents. 2.6.2 Multiple extractions In contrast to many other treatments of extractions and other unbounded dependencies, in HPSG, unbounded dependency features including take a set of feature structures as values, not a single feature structure.24 This correctly predicts the existence of sentences with multiple (non-coreferential) extractions, as in (20). (20) a. A violini this well-crafted, even the most difficult sonatai will be easy to play– j on– i . b. This is a problemi which Johnj is tough to talk to– j about– i . c. These girlsi, these giftsj are easy to give– j to– i . d. These giftsi, these girlsj are easy to give– j – i . As with GPSG, HPSG allows multiple binding of a single extracted element as in (21) by not saying anything to prevent it. (21) a. That was the rebelX leader whoi rivals of– i shot– i . b. Those reportsi , Kim filed– i without reading– i . c. Which relativesi should we send snapshots of– i to– i ? Such structures satisfy the N Feature Principle of HPSG-II, or equivalently, the S Amalgamation Constraint of HPSG-III.25 Pollard and Sag (1994) note a much wider range of acceptability judgements for such constructions than is usually acknowledged.26 24
25
26
In addition, the type of the feature structure that is the value differs among features, so that takes a set of local objects (values of the attribute ), while takes a set of indices, and takes a set of nonpronouns (a subsort of nominal objects). The N Feature Principle required that the value of any nonlocal feature on a phrase be the union of the values for that feature on the daughters, minus the - feature on the head daughter. The S Amalgamation Constraint (Sag 1997) achieves the same effect word-internally on -binding heads like tough-predicates, without invoking an feature for bookkeeping. It tracks -binding in head-valence phrases through heads: a S Inheritance Principle ensures that the value of is the same as its head daughter’s value. For example, sentences like (i) are routinely cited in scholarly discussions as unacceptable (in support of a claim that gaps in non-heads are dependent on coreferential gaps in a head). In fact, however, they are perfectly acceptable to many native speakers. (i) That is the rebel leader whoi rivals of– i shot the British consul. and after considering various alternatives, Pollard and Sag conclude that for some speakers there is a parochial Subject Condition which stipulates that the first element on a lexical head’s argument-structure list may contain a gap only if something else on the list does. A different perspective is offered by Cinque (1990) and Postal (1994), who claim that the fact that it is only NP “extractions” that are involved in so-called parasitic gap constructions and extractions from adjuncts shows that neither are extraction phenomena at all, but rather cases of anaphora involving null resumptive pronouns. The empirical basis of this alternative approach is challenged in Levine et al. (to appear).
Introduction
19
2.6.3 Empty categories Apportioning the saturation-tracking functions of the original feature to the various features allows two kinds of invisible objects to be eliminated: neither extraction traces nor so-called null or zero pronouns need to be represented as abstract or invisible constituents. Extractions are represented as structure-sharing between a member of a lexical head’s set, and the value of an element that is on its - list, but not on its or lists. The lexical rules (Pollard and Sag 1994: 446 – 451) which define this relation have the form of a function (schematized in (22)) from lexical entries to lexical entries that are identical except that they contain on their - list an element whose value is the same as its value; an element with the same value is absent from the list; the set is augmented by that value.
a. b. c. (22)
1 - 〈 . . . , 3 1, . . . 〉 - . . . , 4 ,... {1} , 〈 . . . , 3, . . . 〉 〈 . . . 〉 {2} {2 ø 1}
Note that 4 is the same as 3, except that the specification is added, which entails a different tag; this formulation is consistent with the idea that specifications of the range of a lexical rule are the same as in the domain except where specified to be different.27 Thus, the value of ate in Bagels, Kim ate would be represented as in (23). (Here, NP stands for “synsem appropriate for a NP”.) (23)
〈4〉 〈 〉 - 〈4 NP1, 5 NP [ 32]〉 eat 1 2 | {3}
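The extraction lexical rule schematized in (22) can be pictured as a function from lexical entries to lexical entries along the following lines. The entry format and attribute names are an invented simplification (and, as noted in this section, Bouma et al. 1998 replace such lexical rules with relational constraints); the sketch shows only how a complement is removed from the valence list and its LOCAL value recorded in SLASH.

```python
import copy

def complement_extraction(entry, gap_position=0):
    """Return a new lexical entry in which the complement at
    gap_position is removed from COMPS and its LOCAL value is
    added to the SLASH set (cf. the schema in (22))."""
    new = copy.deepcopy(entry)
    gap = new["COMPS"].pop(gap_position)
    new["SLASH"] = set(new.get("SLASH", set())) | {gap["LOCAL"]}
    return new

# An invented entry for 'ate' with one NP complement.
ate = {
    "PHON": "ate",
    "SUBJ": ["NP[nom]"],
    "COMPS": [{"cat": "NP", "LOCAL": "NP-local"}],
    "SLASH": set(),
}

ate_gap = complement_extraction(ate)
print(ate_gap["COMPS"])   # []            -- the complement is no longer selected
print(ate_gap["SLASH"])   # {'NP-local'}  -- but its LOCAL value is in SLASH
```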
The core idea, as this example illustrates, is that contrary to long-held standard assumptions across a variety of frameworks, heads that appear to be missing arguments necessary to saturate them, but which seem to receive this saturation from arbitrarily distant fillers, actually have different valence requirements from their homophonic variants which are missing none of 27
In important unpublished work, Bouma et al. (1998) replace the use of lexical rules with relational constraints which allow a given lexical entry to characterize just the set of signs that in Pollard and Sag (1994) are licensed by the various extraction lexical rules introduced there. The result is a far more unified treatment of extraction than has been available in previous verions of HPSG.
20
. .
their arguments. Rather, each combination of overt and ostensibly missing arguments actually corresponds to a different lexical item with its own valence specification, and the “missing” elements are simply those that the head in question does not select (though such elements do appear as part of the argument structure specification of the head via its - list. For each gap associated with an argument of a head, the lexical rule reduces the head’s valence by one, corresponding to the missing element. Null pronouns are also treated as being selected via a head’s argumentstructure list, but not as being phonologically realized, and for this reason do not appear on any valence feature list. Since they are not in a syntactic dependency relation with anything, they are not represented in sets either. Null pronouns are represented on the argument-structure list, because they behave the same as phonologically realized pronouns with respect to binding, and binding is a relation among elements on argument-structure lists (see section 2.7). Because zero-pronouns are represented on the argumentstructure list and in , but not on or lists, expressions containing them do not count as being unsaturated on that account. To illustrate, the representation of ate in an answer utterance like Ate squid (cf. Japanese Ika-o tabeta) is sketched in (24): (24)
〈〉 〈5〉 - 〈NP1, 5 NP2〉 eat 1 2 | { }
In contrast, the treatment of implied arguments remains unchanged from HPSG-I. Implied arguments (e.g., the implied argument of ate in I already ate) are distinct from zero-pronouns in having generic or nonspecific rather than definite referents. They are represented in the presence of a role in the representation, but are absent from - and valence feature lists and sets. The nonspecific reference is attributable to the fact that no index is specified for the relevant role in the value, as illustrated in (25). (25)
- eat | { }
〈4〉 〈〉 〈4 NP1〉 1
Introduction
21
2.6.4 Adjunct extractions There appears to be good cross-linguistic evidence that adjunct extraction must be treated by the same mechanisms which establish syntactic connectivity between extracted arguments and their gap sites, regardless of how such sites are treated by the theory. Hukari and Levine (1994, 1995) show that in languages like Irish, Icelandic, Yiddish, Kikuyu, and Chamorro, which morphosyntactically flag filler–gap pathways, adjunct extraction displays the same properties as argument extraction, so that the linkage between filler and gap cannot be, as has occasionally been suggested (e.g., Hegarty 1991, Cattell 1978, Liberman 1973), merely a semantic effect whereby the scope of the modifier is construed over some internal domain. They also adduce evidence from phenomena such as weak crossover that in English, adjunct and argument extraction are a unified syntactic phenomenon. Pollard and Sag (1994) come to the same conclusion, demonstrating the inadequacy of attempts to explain adjunct extractions as parenthetical insertions, using examples such as those in (26), which show that what would have to be treated as parenthetically inserted is not only an incomplete clause, but indefinitely variable, can contradict the meaning that would be conveyed by the rest of the sentence (as in (26d–f ) – unlike uncontroversial parentheticals), and can even conclude with a complementizer (as in (26d,e)). (26) a. b. c. d.
When did they eat dinner? When do you think they ate dinner? On Friday, I suppose they will go to the opera. When their parents are in town next week, I doubt that the twins will attend any lectures. e. When their parents are in town next week, I don’t for a moment think that the twins will attend any lectures. f. From 1980 through 1993, the Senator denies even being contacted by an industry lobbyist.
Example (26f ) shows long-distance extraction from a nonfinite, subjectless clause, where no parenthetical analysis is even imaginable: if any or all of the substring the senator denies even is eliminated, what remains is not a well-formed sentence. It thus appears that adjunct extraction is unbounded, although subject to semantic and pragmatic constraints. This is difficult to reconcile with the analysis discussed above in which traces are eliminated by reinterpreting extraction as valence reduction on heads; since adjuncts have been assumed to be unselected by heads, there seems to be no valence specification for them that can be reduced. Although the specific approach to adjunct extraction taken in Pollard and Sag (1994) encounters serious technical problems, a more recent strategy has emerged which explores the possibility of treating adjuncts as a subtype of selected material. There is some evidence from French and German that
. .
22
certain adverbs are actually sisters of the head rather than adjoined to some phrasal projection of it, and there is additional evidence from French which seems to suggest that some of these sister adverbials are in fact selected by the head (see, e.g., Kasper 1994). It is possible, therefore, that in general adverbial material, like other dependents of the head such as subjects and complements, is recorded among the dependents of the head, and, if so, exactly the same mechanisms may operate in the case of adverbs as with complements (and, as argued in Hukari and Levine (1998), subjects as well; see Bouma et al. (1998) for further discussion). 2.6.5 Constraints on gaps Research in syntax since 1964 has sought to identify environments in which filler–gap linkages are barred and formulate these as independent conditions (Chomsky 1964, Ross 1967). However, as early as the 1970s, alternative explanations began to be fleshed out for the facts these syntactic constraints were supposed to account for (e.g., Grosu 1972). More recently, it has become apparent that much of the data on which many of these universal and abstract constraints have been based are spurious; a somewhat different choice of lexical items or replacement of, e.g., a definite article by an indefinite article significantly improves the acceptability of the examples taken as evidence. In other cases it has become clear that the phenomena used to support universal claims are in fact language specific. Nonetheless, certain broad classes of effects do emerge from the treatment of “extraction” constructions in the theory. Thus, for example, treating extraction gaps as resulting from a lexical property28 which constrains the realization of certain arguments predicts most of the (generalized) left-branch phenomena discussed by Ross (1967), Gazdar (1981), and Pollard and Sag (1994: 175). Likewise, the Conjunct Condition of Ross’s, (1967) Coordinate Structure Constraint (precluding the extraction of only one of a set of conjuncts) is also a consequence of the N Feature Principle and the fact that extracted items must be (arguments of ) subcategorized elements. At the same time, the Element Condition of Ross’s Coordinate Structure Constraint, which only permits across-the-board extractions from conjoined constituents, now seems not to be a syntactic constraint at all in the face of such acceptable sentences as (27) (as noted in Goldsmith 1985 and Lakoff 1986). (27) a. Concerts that short you can leave work early, hear all of– and get back before anyone knows you’re gone. b. This is the whisky which I walked all the way to the store, and I paid for– with my own money, and now you tell me you didn’t need it after all! c. Who is the actor who childhood pictures of– and drawings by Bob Dylan were featured on an album cover? 28
See Pollard and Sag (1994), and Sag (1997) for two approaches to achieving this effect.
Introduction
23
If there were, however, some reason to represent the Element Condition syntactically, it would just be the addition of the boldface clause in an independently needed Coordination Principle along the lines of (28): (28) In a coordinate structure, the (and NONLOCAL) value of each conjunct daughter is an extension of that of the mother. As for Ross’s Complex NP Constraint (preventing extraction from noun complements and relative clauses), it has been known for decades that the noun-complement cases are often completely acceptable. Consequently, any constraints on them are pragmatic, not syntactic, in nature. In fact, the HPSG treatment of extraction predicts that in the general case “extractions” from the clausal complements of nouns will be syntactically well-formed, since the finite clause is just a complement of the noun ( fact, proposal, idea . . . ), and nothing prevents extraction of arguments from complements.29 Relative clauses, on the other hand, do seem to be strictly and syntactically constrained not to contain gaps, and analyses in both HPSG-II (Pollard and Sag 1994: ch. 5) and HPSG-III (Sag 1997) must in effect stipulate that the value set in a relative clause is a singleton whose index matches that of the relative pronoun.30 There are some remaining stubborn cases which do not fall out of any of the general HPSG conditions and which cannot be shown to have been erroneously classified as ill-formed. Pollard and Sag argue that of the two major classes of parasitic gaps, only the prohibition on unsupported extraction out of subjects is real; examples like (29), discussed since the mid-1980s at least, 29
The standard examples taken to exemplify the restriction in question were extractions from complex definite NPs like the claim that we vetoed the bill. This was quite natural, since the most reliable strategy for identifying the content of a proposition is to uniquely specify that proposition, so the default usage would be a definite example like (i) rather than an indefinite one like (ii). (i) the belief that Ollie was a spy (ii) a belief that Ollie was a spy But as is well known (though not well understood), definite NPs tend to make linking a gap to a filler outside the NP awkward-sounding. With pragmatically realistic examples involving indefinite NPs, the Complex NP effect disappears, as in examples like (iii): (iii) Which Middle Eastern country did you hear rumors that the CIA had infiltrated?
30
See Pollard and Sag (1994) for discussion and further examples. Note that this formulation is a good deal more restricted than Ross’s original constraint even with respect to relative clauses. Ross’s formulation mistakenly predicts that (i) is bad, when it is in fact quite acceptable for many (possibly most) speakers. (i)
Robini we all love– i , but Terryj I don’t know who likes– j .
Pollard and Sag (1994) correctly predict (i) to be well-formed, since the authors take who here to be an in situ subject. Hukari and Levine (1998) present arguments for vacuous wh extraction, and an analysis along these lines is incorporated in the integrated approach to such extraction presented in Bouma et al. (1998). But there is no reason to believe that such extraction is obligatory; the existence of in situ wh in other positions suggests that there is no a priori reason to deny wh subject the possibility of in situ occurrence.
are taken to point to the spuriousness of any general prohibition on unsupported extraction from adjuncts. (29) Which papers did you go to England without signing? Pollard and Sag (1994) articulate a Subject Condition for those idiolects where speakers reject sentences with gaps within subjects, as in (30). (30) That is the rebel leader whoi rivals of– i shot the British consul. This represents the fact that for some speakers, such gaps must be supported by another gap within the VP which selects the subject containing the gap.31 It might appear at first that Ross’s Sentential Subject Constraint (SSC) is a subcase of the Subject Condition. But this is not correct. First, having an additional, nonsubject gap doesn’t improve SSC violations: (31) *Pat, that Andy hated– irritated– . Second, as indicated, not all idiolects of English are constrained by the Subject Condition, but uncontroversial SSC violations are rejected by all speakers. Thus, whatever the status of the Subject Condition, the Sentential Subject Condition seems to require its own separate, idiosyncratic statement pretty much along the lines given in Ross’s thesis.32
2.7 Binding

The HPSG account of binding phenomena starts from the premise that theoretical inconsistencies and documented counterexamples to familiar binding theories require sentence-internal dependencies between referential noun phrases and coreferential reflexives, reciprocals, and personal pronouns to be stated in a way which does not make reference to syntactic configurations. The HPSG account was developed to account for the following facts:
• anaphoric personal pronouns (“pronominals”) and reflexives (“anaphors”) are in roughly complementary distribution, insofar as reflexives must have a clausemate antecedent, and pronouns may not have one (but note the following exceptions to this general pattern).
31 Even this formulation is problematic, because of examples provided in Ross (1967), such as These are the cars of which only the hoods were damaged in the explosion. It was precisely because of such examples that Ross restricted his condition on extractions from subjects to clausal subjects.
32 Note, however, that it is not just clausal subjects which block this extraction: all phrases headed by verbs or complementizers (verbals in Sag 1997) display the same property, and the same pattern holds for gerundive NPs:
(i) *Lou, to argue with makes me sick.
(ii) *Lou, to argue with infuriates even friends of.
(iii) *Lou, arguing with infuriates even friends of.
• Reflexives in picture-NPs can have antecedents in higher clauses.
• Reflexives in picture-NPs can have antecedents outside the sentence.
• Reflexives with plural reference can be bound to noun phrases which jointly do not form a syntactic constituent.
The HPSG account is framed in terms of constraints on relations between coindexed subcategorized arguments of a head (i.e., between objects on an argument-structure list which have the same INDEX value).33 On Pollard and Sag’s account, extraction gaps will have the same index-type as the filler they are bound to, since what is structure-shared in the unbounded extraction dependency is a LOCAL value, and the value for INDEX is a part of a LOCAL value.

The HPSG account of binding is stated in terms of obliqueness-binding (o-binding), which is dependent on the notion of obliqueness-command (o-command), defined in terms of the obliqueness relation which orders the elements on an argument-structure list. For objects Y and Z, Y is less oblique than Z iff Y precedes Z on the ARG-ST value of a lexical head.

For objects Y and Z with distinct LOCAL values, and Y referential, Y locally o-commands Z iff
• Y is less oblique than Z,
• or Y locally o-commands some X that subcategorizes for Z.

For objects Y and Z, with distinct LOCAL values, Y referential, Y o-commands Z iff
• Y is less oblique than Z,
• or Y o-commands some X that subcategorizes for Z,
• or Y o-commands some X whose LOCAL value is token-identical to that of Z.

Y (locally) o-binds Z iff
• Y and Z have the same index
• and Y (locally) o-commands Z.

Z is (locally) o-free if Z is not (locally) o-bound. Considering its nonconfigurational approach, the HPSG binding theory nonetheless looks pretty familiar:

A. A locally o-commanded anaphor must be locally o-bound.
B. A personal pronoun must be locally o-free.
C. A nonpronoun must be o-free.
33 This approach was pioneered more than a decade ago in work by William O’Grady, utilizing a version of categorial grammar (1987).
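The logic of these definitions lends itself to a small procedural sketch. The following Python fragment is purely illustrative and is not drawn from any published HPSG implementation: it simplifies the definitions above by ignoring the recursive clauses of (local) o-command, and the record attributes and example arguments are invented for the purpose of the demonstration.

# Toy ARG-ST list: arguments ordered from least to most oblique.
# Each argument records its anaphoric type and its referential index (None = expletive).
ARG_ST = [
    {"form": "they", "type": "ppro", "index": 1},
    {"form": "each other", "type": "ana", "index": 1},
]

def referential(arg):
    return arg["index"] is not None

def locally_o_commanded(arg_st, j):
    """An argument is locally o-commanded if some referential argument precedes it."""
    return any(referential(arg_st[i]) for i in range(j))

def locally_o_bound(arg_st, j):
    """Coindexed with some less oblique referential co-argument."""
    return any(referential(arg_st[i]) and arg_st[i]["index"] == arg_st[j]["index"]
               for i in range(j))

def satisfies_principle_a(arg_st):
    """Principle A: a locally o-commanded anaphor must be locally o-bound.
    Anaphors that are not locally o-commanded (e.g. list-initial ones) are exempt."""
    return all(locally_o_bound(arg_st, j)
               for j, arg in enumerate(arg_st)
               if arg["type"] == "ana" and locally_o_commanded(arg_st, j))

print(satisfies_principle_a(ARG_ST))              # True: "they ... each other"
print(satisfies_principle_a([ARG_ST[1]]))         # True: a list-initial anaphor is exempt
print(satisfies_principle_a([
    {"form": "he", "type": "ppro", "index": 1},
    {"form": "herself", "type": "ana", "index": 2},
]))                                               # False: locally o-commanded but not o-bound

On this toy encoding, an argument-structure-initial anaphor comes out as vacuously exempt, which anticipates the “exempt” cases discussed immediately below.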
However, it differs crucially from typical configurational accounts in that it has an inherently narrower scope. Principle A does not constrain all anaphors to be locally o-bound (that is, coindexed to something before them on an argument-structure list), but only those which are locally o-commanded (i.e., those which are noninitial). This makes strong, vulnerable, and apparently correct claims. First, coindexed pronouns which are initial elements on argument-structure lists are unconstrained – free to be anaphors, coindexed to anything, and vacuously satisfying Principle A, or to be pronouns, substantively satisfying Principle B. Thus, the theory predicts that pronoun objects in these “exempt” conditions which are coindexed to anything anywhere in a higher clause, or outside the sentence, can be either anaphors or pronouns.34 This is correct; the reflexives that contradict the naive versions of Principle A are generally replaceable with pronouns with the same reference. Second, because nonpredicative (“case-marking”) prepositions have a CONTENT value which is structure-shared with their object (since the preposition makes no contribution to the meaning), the prepositional phrase is a nominal-object of the same sort as its object, and constrained by the binding theory just as if it were an NP. Thus, in contrast to a configurational binding theory, they pose no problem; when their nominative and accusative NPs are coindexed with each other, depends on requires an anaphoric accusative object and disallows a pronoun, just as trust does. Third, a pronoun or anaphor cannot have a nonpronominal “antecedent” in a lower clause because the coindexing would put the nonpronominal in violation of Principle C. Fourth, the analysis of extraction gaps predicts that the missing element is of the same sort as the filler, and therefore predicts that (32a) is a Principle C violation, while (32b) is not.
34 The following kinds of phrases are thus exempt:
• pre-nominal possessives (These are determiners, with CONTENT values equivalent to NPs, but they are the unique items on the argument structure lists of Ns, so they are not subject to Principle A, since they are not locally o-commanded.)
(i) Bush and Dukakis charged that Noriega had contributed to each other’s campaigns.
• objects of (prepositions in) picture-NPs (These are also a unique item on an argument structure list, and so not locally o-commanded.)
(ii) a. The children thought that pictures of themselves were on sale.
b. I suggested that portraits of themselves would amuse the twins.
c. John knew there was a picture of himself in the post office.
• objects, when the subject is expletive (The would-be o-commander is not referential, but o-command is not defined for nonreferential types, therefore the next item on the list is not locally o-commanded.)
(iii) a. They made sure that it was clear to each other why Kim had to go.
b. John knew that there was only himself left.
• accusative subjects (As subjects, they are not locally o-commanded. Therefore they are exempt, and anaphors are allowed.)
(iv) a. John wanted more than anything for himself to get the job.
b. What John would prefer is for himself to get the job.
(32) a. *Johni, hei said you like ti.
b. Himi, hei said you like ti.

Finally, the HPSG account of binding phenomena predicts that with multiple complements of the same verb, more oblique arguments cannot bind less oblique ones, regardless of their relative phrase order, so that (33a) and (33b) are correctly predicted to be unacceptable since the anaphor goal phrase is less oblique than the nonpronominal about-phrase.

(33) a. *Marie talked about John to himself.
b. *Marie talked to himself about John.

The HPSG account of binding phenomena does not attempt to explain all constraints on the dependency between an anaphor or pronoun and its antecedent in terms of o-binding. Pollard and Sag (1994) note at least two sorts of nonsyntactic factors that appear to enhance or degrade acceptability. First, acceptability judgments that are sensitive to whether a potential (unintended) antecedent intervenes between a pronoun or anaphor and a target antecedent are arguably due to processing factors, rather than grammatical constraints.35 Second, insofar as point-of-view is reflected in the choice between personal pronoun and reflexive when both are allowed, the point-of-view has to be consistent within a single sentence.

The HPSG binding theory is still incomplete, however, in that it does not take into account the interaction between various arguments and adjunct constituents under coreference. For example, in the case of Principle C we find asymmetries between subject and object pronouns vis-à-vis possible coindexings with NPs within adjuncts, but no such asymmetries with respect to NPs inside complement clauses. Thus, coindexing with an object is possible into an adjunct clause as in (34a), but not into a complement clause as in (34b).

(34) a. You can’t even greet themi without the twinsi getting offended.
b. *You can’t tell themi that the twinsi are horrid.

At the same time, coindexing of a subject into either kind of clause is impossible, as shown in (35a,b).

(35) a. *Theyi never do anything without the twinsi attracting attention.
b. *Theyi say that the twinsi are inseparable.

In other words, when object cataphora as in (34) is involved, adjuncts act as though they were outside the domain of the binding theory, which is exactly what one would predict on Pollard and Sag’s general approach, since o-command relations do not hold between a complement and an adjunct – there is no obliqueness relation between them. On the other hand, we would expect the same thing to follow when subject cataphora was involved (as in (35)), yet

35
This point has been noted by a number of syntacticians, among them Grinder (1970, 1971), Kimball (1971), and Jacobson and Neubauer (1976).
here the theory mispredicts, because subject cataphora is systematically impossible even when the coindexed NP is within an adjunct. This problem, discussed briefly in Hukari and Levine (1996) and at length in Hukari and Levine (1995b) and Bouma et al. (1998), is only one of several intersecting puzzles involving adjuncts, arguments, and extracted elements in the HPSG binding theory, and it is clear that all three phenomena require more consideration before there can be a satisfactory resolution.
3 Overview of the volume
The issues and analyses that we have reviewed in the preceding sections constitute the starting point for an increasingly voluminous literature exploring the mutual impact of HPSG and a broad spectrum of empirical data from scores of human languages. Inevitably, as the tenets of the theory that reflect properties of more familiar languages have been confronted with less well-known phenomena, many extensions and revisions to HPSG’s theoretical apparatus have been introduced and defended, and new conceptions of valence, phrase structure, and numerous other components of the theory have appeared and found application. The papers in this collection bear witness to both the breadth and depth of the continuous rethinking that the theory has undergone since the appearance of Pollard and Sag (1994).

A number of difficult issues of argument structure and constituency in German are investigated in a pair of papers (Hinrichs and Nakazawa, and Baker) both of which focus on a complex of interrelated phenomena which most prominently include the topicalization of incomplete verb phrases and the multiple ordering possibilities of sequences of verbs with respect to certain governing auxiliaries. The two papers share a number of common perspectives on the analysis of German clausal structure and the nature of lexical entries for auxiliaries that interact with the structural possibilities allowed in German. For example, the authors all assume (i) that (following earlier work by Uszkoreit, Nerbonne, Pollard, and others) German clauses typically have relatively little hierarchical structure, and (ii) as argued in the earlier work by Hinrichs and Nakazawa cited in their paper, German auxiliaries inherit the complement structure of the verbs they select, in addition to the normal structure-sharing of subjects which is familiar from English. This widely influential analysis, which has been applied to Romance languages, Korean and others, in effect allows the argument structure of auxiliaries to be quite underspecified, a function in each case of the particular verb that the auxiliary combines with. The result of such lexical items in combination with the Valence Principle is the collapsing of hierarchical structure within the VP into a single string of arguments of an arbitrary number of selected heads (each of which in turn takes as its own arguments the arguments of its selected V complement, and which in turn passes selection of these arguments to the auxiliary which selects it, a situation often referred to as argument inheritance).
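The effect of argument inheritance can be pictured with a schematic fragment. The following Python sketch is offered purely for illustration and is not the formalization used in either paper: lexical entries are reduced to a PHON value and a COMPS list, and the German verbs (taken from example (37) below) serve merely as convenient labels.

# Schematic lexical entries: COMPS lists ordered by obliqueness.
erzaehlen = {"PHON": "erzählen", "COMPS": ["NP[dat]", "NP[acc]"]}   # 'tell'
muessen   = {"PHON": "müssen"}                                      # 'must'
werden    = {"PHON": "werden"}                                      # 'will'

def inherit_arguments(aux, verbal_complement):
    """An argument-inheriting auxiliary selects a verbal complement and, in
    addition, takes over that complement's own unsatisfied COMPS list."""
    return {"PHON": aux["PHON"],
            "COMPS": [verbal_complement] + verbal_complement["COMPS"]}

# werden governs muessen, which governs erzaehlen: the stem's arguments
# percolate up, so all NP arguments become sisters in one flat clausal domain.
cluster = inherit_arguments(werden, inherit_arguments(muessen, erzaehlen))
print([c if isinstance(c, str) else c["PHON"] for c in cluster["COMPS"]])
# ['müssen', 'erzählen', 'NP[dat]', 'NP[acc]']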
The result is that a number of signs, corresponding to the arguments of different verbs, now appear as sisters within the clause, and are mutually orderable by means of LP statements. Clausal structure becomes flatter still, in both papers, through the adoption of Pollard’s suggestion (following work by Borsley on Welsh) that finite German clauses have null SUBJ specifications, and that the argument corresponding semantically to the subject is in fact the least oblique element on the COMPS list.

The difficulties considered in these two papers arise in connection with the appearance of incompletely saturated, verb-headed constituents within clauses. This in itself would not necessarily be a significant problem for the assumed flat structure of such clauses – ample evidence has been adduced, since McCloskey’s pioneering work on Irish in the early 1980s, that Celtic VSO languages license VP constituents, and similar results have been obtained for other languages. The difficulty with the German data is that VPs that appear as topics need not be fully saturated objects. Rather, they may appear with elements missing that would be required to satisfy the valence of their heads; these missing elements are left apparently in situ within the main clause to which the (partial) VP is linked as a filler. English has a few types of construction which parallel the German situation; thus, example (36) apparently involves an AP filler missing a complement of the head.

(36) How easy is Robin to please?

But Hinrichs and Nakazawa argue that in German the missing material is still part of the VP, and cannot be plausibly treated as involving extraposition, e.g. their example in section 5 of their paper:

(37) Ein Märchen erzählen wird er seiner Tochter müssen.
a fairytale tell will he his daughter must
“He will have to tell his daughter a fairy tale.”

The presence of the auxiliary müssen, apparently in situ and to the right of seiner Tochter, makes it difficult to argue plausibly that seiner Tochter is linked to a gap site within the fronted VP Ein Märchen erzählen, as vs. to please in the English case, since if extraposition were involved in the German example we would expect seiner Tochter to appear at the very end of the sentence. The question then arises as to the nature of the linkage between the partial VP on the left and the apparently saturating NP at the end of the sentence, and both the Hinrichs and Nakazawa and Baker papers offer characterizations of the relationship between the incomplete contents of the VP topic and the corresponding arguments that appear within the main clause.

It is in the actual mechanisms proposed to account for this linkage that the two papers differ. Hinrichs and Nakazawa treat the fronted partial VP (PVP) as a constituent completely saturated for COMPS, with the missing material recorded in the SLASH set specification of the V head, in association with a lexical rule mapping ordinary auxiliaries into verbs SUBCATed for PVPs containing certain gaps, and – crucially – selecting as complements constituents whose LOCAL values are structure-shared with the missing elements.
Thus, the elements missing from the PVP are of the same kind as those involved in extraction, but the gaps involved are never actually filled; instead, the missing material is recorded on the COMPS list of auxiliary verbs and appears in main clauses as part of the saturation of those verbs, rather than as fillers of the topic-internal gaps. The authors further show that very much the same sort of approach provides an account of the phenomenon they refer to as split NP topicalization, in which portions of an NP appear to be topicalized, leaving behind various remnants, but where connectivity failures between the fronted material and the remainders in the main clause point away from a simple extraposition-type linkage.

Baker’s analysis explores a phenomenon touched on briefly in Hinrichs and Nakazawa’s paper, called modal flip, whereby certain modal auxiliary heads appear to the left of a cluster of verbs that would ordinarily be complements of such auxiliaries via argument inheritance, rather than to the right as is normal. In Baker’s analysis, this phenomenon is linked with the PVP fronting cases considered by Hinrichs and Nakazawa because, as she argues, the verb clusters involved in modal flip turn out to be PVPs. In her view, however, PVPs are not saturated elements containing gaps, but rather unsaturated constituents whose valence is related to the valence of the head and its complements in a more complex way than the simple concatenation relation familiar from the Valence Principle. Baker’s account, like Hinrichs and Nakazawa’s, matches an auxiliary element, which shares its lexical verbal complement’s valence specification, to a matching item which is SUBCATed for a PVP and which selects just the material from that PVP, but in her analysis the missing material is recorded in the COMPS specification rather than the SLASH value of the fronted constituent. The differences in the two treatments lead to a number of different consequences, both empirical and theoretical, but both analyses take full advantage of the considerable expressive power of lexical rules, yielding words whose lexical specifications give rise to constructions mimicking filler–gap linkages without utilizing the connectivity mechanisms such linkages depend on.

A second set of papers in this collection also presents alternative accounts of basic clausal structure in a single language – in this case Japanese – which agree on a substantial number of points but explore different theoretical technologies to capture the range of facts adduced. Thus, Fukushima argues in his paper on topicalization that -wa topics in Japanese are not adjuncts on V projections, as argued in work he cites by Gunji, but are actually syntactic arguments of Vs, though not necessarily semantic arguments of these Vs (a type of treatment that mirrors similar claims about the treatment of certain Japanese adjuncts in the paper by Manning et al., as well as much recent work in English and Romance, e.g. Bouma et al. 1998); adjuncts as well as complements can be topicalized in this way. On this proposal, topics are part of the ARG-ST list of heads. Fukushima proceeds to defend the claim that several relative clause phenomena which appear to violate constraints on extraction are explicable if certain topics interpreted with scope inside an extraction island
are syntactically arguments of a predicate higher than that island, via one of the topicalization lexical rules he proposes. By formulating this rule so that it only allows NPs to be topicalized (but allowing the other rule to apply to any XP), he derives the differential behavior of different kinds of topicalization with respect to the types of categories that can be topicalized. Furthermore, by his account he is able to motivate the fact that a constituent can be construed as the topic of a coordination of verbs, one of which has its “normal” argument structure (by virtue of the topic substitution rule) and the other of which has an “enhanced” argument structure (by virtue of the topic addition rule), suggesting that the argument structures of the two verbs are identical (since they share the same argument daughters and hence must be “co-saturated”). In the course of the argument, Fukushima considers certain possible counterevidence, involving word order facts and nonfeeding relations among lexical rules, and offers accounts of these data which render them compatible with his proposal. He also considers whether these topics could be taken to be adjuncts rather than arguments, and concludes that this would be less preferable since (i) they cannot be conjoined with uncontroversial modifying adjuncts although the latter can be conjoined with each other and (ii) they cannot be modified by adverbial modifiers.

The focal problem for the other two papers on Japanese is the structure of Japanese causative constructions, which exhibit a problematic duality: certain diagnostics for structure point unequivocally to a monoclausal structure for these causatives, whereby the causative morpheme is treated as a suffix to a stem and all NPs in the clause are selected by that stem, whereas other tests equally strongly indicate the need for a biclausal representation. As both papers note, the key to this discrepancy is the nature of the diagnostics used: criteria that invoke (morpho)phonological patterns, such as reduplication and derivational morphology, argue for monoclausality, while those that are conventionally assumed to depend on semantic or syntactic structure support biclausality. Whatever solution is employed, therefore, entails some kind of dual structure, corresponding to a mismatch among components of the representation, and between them the two papers show the kind of representational devices called for, depending upon where the mismatch is located. In the paper by Manning et al., the morphological, phonological, and hierarchical syntactic properties of clauses are taken to be in close correspondence. The discrepancy between the two classes of diagnostics alluded to above is located in the relationship between these properties on the one hand and argument structure on the other: complex predicates formed morphologically from stems have a single set of valence specifications corresponding to a multi-level ARG-ST list containing as one of its elements another ARG-ST list whose membership is token-identical to a subset of the head’s valence elements. Thus, corresponding to a set of complement sisters . . . , NP[1], NP[2], NP[3] . . . for some complex predicate, we can expect to find a specification [ARG-ST 〈1, . . . , [ARG-ST 〈2, 3〉] . . . 〉, . . . ].
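To make the shape of such a specification concrete, the following Python rendering is an invented illustration, not the authors’ notation: the attribute names, the choice of a causativized transitive, and the use of object identity to model token identity are all assumptions introduced only for expository purposes.

# Token identity is modeled by reusing the same Python objects.
np1 = {"TAG": 1, "CAT": "NP"}   # causer
np2 = {"TAG": 2, "CAT": "NP"}   # causee
np3 = {"TAG": 3, "CAT": "NP"}   # object of the stem

complex_predicate = {
    "PHON": "yom-ase",                          # read-CAUS (hypothetical example)
    "VALENCE": [np1, np2, np3],                 # the flat set of complement sisters
    "ARG-ST": [np1, {"ARG-ST": [np2, np3]}],    # multi-level argument structure
}

# The embedded ARG-ST members are token-identical to a subset of the valence elements:
print(complex_predicate["ARG-ST"][1]["ARG-ST"][0] is complex_predicate["VALENCE"][1])  # True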
The argument structure so depicted has in common with syntactically biclausal constructions a number of binding and scoping phenomena, under suitably revised theories of adjunct configurations (where adjuncts are reinterpreted as complements that can take scope over the verbal projections in which they appear as daughters) and quantifier propagation and retrieval. These analytic innovations jointly reconcile a flat morphosyntactic representation with the apparent evidence for complex syntactic relations.

Takao Gunji’s paper starts from a different premise, viz., that there is a straightforward correspondence between syntax and argument structure, and that the mismatch is rather between morphophonology and syntax. In constructing his alternative analysis, Gunji relies on a number of ideas first introduced in work by David Dowty and Mike Reape, published as Dowty (1996) and Reape (1996) respectively and developed in considerable detail in Reape (1993) and Kathol (1995). The two crucial technical concepts that undergird Gunji’s solution are domain union and attachment. Domain union is a relation on strings of constituents dominated by a certain category C, defined in terms of the operation called sequence union, or shuffle, whose formal definition Gunji reproduces from Reape’s work as (38), where s denotes sequence union, ξ the empty list, X, Y, and Z arbitrary lists, and α|β the append of lists α, β:

(38) a. s(ξ, ξ, ξ)
b. s(X, Y, Z) → s(a|X, Y, a|Z)
c. s(X, Y, Z) → s(X, a|Y, a|Z)

The first clause of this definition identifies the sequence union of two empty lists as being the empty list, and serves as the formal basis step for the definition. The next two clauses roughly speaking indicate that if the sequence union of two lists is some third list, then the sequence union of one of the original lists with an extension (via a prefixing of one or more elements) of the other is the original sequence union extended in exactly the same way. When the consequences of the definition are worked out, what emerges is that (38) in effect shuffles two lists together, so that in the sequence union Z of X and Y their list memberships are arbitrarily interleaved, but subject to the condition that the order of elements in Z from X is the order imposed on X, and similarly for Y.36
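A procedural paraphrase of (38) may also be helpful. The following Python function is our own sketch, not Reape’s or Gunji’s formalization; it enumerates exactly the lists standing in the sequence union (shuffle) relation to two input lists.

def sequence_union(x, y):
    """Enumerate every list z such that s(x, y, z) holds under definition (38):
    the elements of x and y are interleaved, each preserving its original order."""
    if not x and not y:          # clause (38a): s(ξ, ξ, ξ)
        return [[]]
    results = []
    if x:                        # decomposing clause (38b): the next element comes from x
        results += [[x[0]] + z for z in sequence_union(x[1:], y)]
    if y:                        # decomposing clause (38c): the next element comes from y
        results += [[y[0]] + z for z in sequence_union(x, y[1:])]
    return results

print(sequence_union(["a"], ["c", "b"]))
# [['a', 'c', 'b'], ['c', 'a', 'b'], ['c', 'b', 'a']]

The sample call reproduces the enumeration worked through in note 36 below.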
36 The objects licensed by this somewhat cryptic definition are not intuitively obvious, and the best way to gain an insight into its denotation is to apply it recursively to pairs of lists of increasing length. As noted in the text, the basis in this recursion is provided in (38a), from which it follows that singleton lists 〈a〉 and 〈b〉 are jointly in the sequence union relation with {〈a, b〉, 〈b, a〉}; specifically, taking into account that 〈a〉 = 〈a|ξ〉, we have s(〈a〉, ξ, 〈a〉), therefore by (38c) s(〈a〉, 〈b|ξ〉, 〈b, a〉), etc. It is then evident that s(〈a〉, 〈c, b〉, {〈a, c, b〉, 〈c, a, b〉, 〈c, b, a〉}). Note that the order of c and b is invariant in the lists enumerated by this definition. Each additional application of the definition to successively larger lists will yield a set of lists where the elements from X and Y respectively appear in Z in the same order applying to them in X and Y, just as shuffling two packs of cards interleaves the cards but does not change their relative order with respect to the original decks.
Suppose now that the elements in the list are (partial descriptions of) feature structures – specifically, those corresponding to heads and their sisters. It is clear that for any two phrases whose daughters constitute lists, it is possible to define some attribute of the mother (frequently called DOM in the literature based on Reape’s approach to the linkage between constituency and linear order) as the sequence union of those lists. If this attribute is, either directly or indirectly, linked to the phonological form of the linguistic expression in question, we have a situation in which two structurally distinct phrases representable formally as trees may be realized as any combination of the phonetically realizable units associated with each phrase, subject to the ordering condition just stated and possibly others. An example of the latter is Dowty’s notion of attachment, whereby two elements in the attachment relation cannot be separated by any other.

With these technical tools in place, Gunji’s strategy for capturing the duality of the Japanese causative is immediately apparent: the syntactic representation, taken as the configurational expression of the valence properties of heads, is indeed biclausal, but is mapped into a morpho/phonological string in which elements belonging to different syntactic constituents form units with respect to morphological and phonological rules. Thus, on his account the relationship between syntax and argument structure is a morphism in at least one direction, but the relationship between syntax and morphophonological structure is not, and affixes on stems may belong – as in the case of the honorification prefix o – to quite different constituents from the stems to which they are affixed. This kind of solution seems in some respects to be a monostratal analogue of one or another avatar of transformational morphology, and indeed raises comparable issues about lexical integrity, morphology-free syntax and the like. But it is important to note that such solutions are easily accessible within the expressive capacity of HPSG’s formalism, and are a reminder that in this framework as in all others, issues of grammatical representation are open questions not just at the level of word combinations, but word-internally as well.

The paper by Kathol directly addresses the latter point, examining the morphology/syntax interface at one of its most universal and, in many respects, problematic loci, viz., the phenomenon of agreement and its formal properties in HPSG. Kathol considers and rejects the purely semantic approach to agreement proposed in Dowty and Jacobson (1988), but his main focus is on the analytic problems in Pollard and Sag’s (1994) theory of agreement, where verbal agreement is basically regarded as an epiphenomenon of head subcategorization for certain morphosyntactic information on selected arguments, and anaphor/antecedent agreement is uniformly treated via a single mechanism, the specification of information about indices, regardless of whether the agreement is driven by pragmatic and semantic considerations or by purely formal morphosyntactic properties of the antecedent. As Kathol notes, this approach has certain serious flaws, both conceptual and empirical,
many of which arise as consequences of making indices do virtually all of the work in linking the elements that manifest agreement dependencies. Kathol’s solution is to reintroduce the AGR feature, originally a component of the GPSG agreement mechanism, as part of the specifications internal to an object supplied as a value for a feature, and to allow AGR to record information relevant to the agreement pattern reflected in the data. Agreement itself then becomes a relationship not of selection, as in most of the HPSG literature, but rather one of covariation or matching between the information content of the elements in the agreement relation. Inflected forms contain information about phonological, morphosyntactic, and semantic properties of their stems, an approach suggestive of the Manning, Sag, and Iida treatment of complex predicates. In Kathol’s analysis, however, all morphologically elaborate lexical entries have this internal architecture, reflecting the inheritance of constraints associated with the sort labels attached to these morphologically complex forms via word-formation schemata. As Kathol himself emphasizes, his approach entails a view of morphology that embodies the fundamental position of Aronoff (1975), in which morphemes have no independent existence as units and simply reflect the phonological difference between simpler and more complex forms related not by concatenation but by rules mapping the morphosyntactic specifications of signs directly to phonological forms, where Kathol follows the recent work by Stump he cites on the form and operation of such rules. Indices still enter into a variety of agreement relations in Kathol’s analysis, but now the range of agreement patterns that can be unproblematically captured is greatly increased because of the possibility of more complex interactions between purely formal and more content/convention-based aspects of linguistic signs than is allowed by Pollard and Sag’s comparatively spartan approach. In particular, the morphophonological parallel between selecting and selected elements receives motivation which it lacks on the selectional view of agreement, and intricate patterns of hybrid agreement become more tractable. Kathol concludes by considering some possible ways to extend his general treatment to areas of morphosyntax that do not directly reflect agreement dependencies.

The paper by Johnston, like Kathol’s, also exploits the possibility of stating far-reaching conditions on patterns of covariation in grammatical phenomena through the association of certain high-level types with general constraints that are shared by all subtypes, but the phenomena he addresses involve the interface not between syntax and morphology but between syntax and semantics. Johnston’s paper considers the distribution and interpretation of purposive adjuncts, which appear in two syntactic guises: for PPs, as in I bought a cake for Terry, and for clauses, which have a variety of shapes but which, Johnston argues, show significant parallels to the subtypes of purposive PPs that he considers. What Johnston finds striking about these parallels is that, in either PP or clausal forms, the interpretive possibilities of these adjuncts are tightly linked with certain syntactic properties of the VP whose denotation is modified by
the adjunct material, and furthermore, that a correct characterization of these properties requires positing a supertype of participant role, which Johnston calls the “affected object,” that subsumes a large number of highly specific participant roles while excluding many others. The most dramatic instance of these effects is what he calls the “recipient” interpretation of purposive adjuncts, an interpretation which is unavailable unless the psoa denoted by the modified VP contains an affected object whose index is part of the sign belonging to some syntactically overt element in the modified clause. Johnston argues that this requirement of syntactic overtness for the affected object explains why the recipient interpretation is not available when the relevant material in the main VP is gapped or this VP is replaced by do so, even though the affected object is semantically recoverable in the interpretation of such examples. Although Johnston does not present a formalization, his treatment appears to capture the affected object requirement most naturally by taking the various kinds of purpose adjuncts he discusses as subsorts of a general sort, and then making the appearance of the recipient subsort contingent on the presence of an affected object within the VP that the adjunct modifies.37 In order for such an approach to work, of course, it is crucial to be able to identify, for each VP modified by a purposive adjunct, whether or not one of the syntactic arguments within that VP represents an affected object. The problem, of course, is that participant roles identified in CONTENT are highly specific: in a VP whose head is the verb lead, for example, the corresponding psoa will identify a relation associated with at least the roles LEADER and LED. In order to determine if one of these roles corresponds to an affected object, some other theory of semantic roles must be invoked that cross-cuts the large number of specific roles associated with each head’s CONTENT value, and here Johnston explicitly assumes that the type hierarchy will be recruited to group together psoas containing participant roles which can license the recipient interpretation of a purposive PP adjunct.38 Johnston’s analysis thus nicely illustrates the descriptive efficacy of the parallel representation of syntactic and semantic information in the HPSG representation of linguistic expressions. And in exploiting this form of representation, it further reinforces the crucial theoretical role of the type hierarchy that assigns these expressions (including not only VP adjuncts but purposive adjuncts to NP heads) to appropriate ontological statuses bearing on their grammatical behavior.39

37
38
39
This might be implemented by linking the recipient interpretation exhibited in the value to the overt presence in one of the sign’s valence features of a value whose index is structure-shared with that of the affected object participant in . Recent work on the linkage between obliqueness relations, argument structure and the organization of the lexicon by Wechsler (1995) and Davis (1996) provides the kind of theoretical resources suitable for implementing this kind of clustering of participant roles. Of particular interest is Johnston’s account for certain subtle difference between the two purposive adjuncts in VPs and NPs respectively.
References Aronoff, Mark. 1975. Morphology in Generative Grammar. Cambridge, MA: MIT Press. Borsley, Robert. 1987. Subjects and complements in HPSG. Technical report no. CSLI107– 87. Stanford: Center for the Study of Language and Information. Borsley, Robert. 1989. Phrase structure grammar and the Barriers conception of clause structure. Linguistics 27: 843– 863. Borsley, Robert. 1990. Welsh passives. In Celtic Linguistics: Readings in the Brythonic Languages, ed. M. J. Ball, J. Fife, E. Poppe, and J. Rowland. Philadelphia and Amsterdam: Benjamins. 186 –203. Bouma, Gosse, Rob Malouf, and Ivan Sag. 1998. Satisfying constraints on extraction and adjunction. MS, Stanford University and University of Tübingen. Bresnan, Joan. 1977. Variables in the theory of transformations. In Formal Syntax, ed. P. Culicover, T. Wasow and A. Akmajian. New York: Academic Press. 157–196. Carpenter, Robert. 1992. The Logic of Typed Feature Structures. Cambridge: Cambridge University Press (Tracts in Theoretical Computer Science). Cattell, Ray. 1978. On the source of interrogative adverbs. Language 54: 61–77. Chomsky, Noam. 1964. Current Issues in Linguistic Theory. The Hague: Mouton & Co. Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris. Cinque, Gugliemo. 1990. Types of e-Dependencies. Cambridge, MA: MIT Press. Culicover, Peter W. 1993. Evidence against ECP accounts of the that-t effect. Linguistic Inquiry 24: 557–561. Davis, Anthony. 1996. Linking and Lexical Semantics in the Hierarchical Lexicon. Ph.D. dissertation, Stanford University. Dowty, David. 1996. Towards a minimalist theory of syntactic structure. In Discontinuous Constituency, ed. A. van Horck and W. Sijtsma. Berlin: Mouton de Gruyter. 11–62. Dowty, David, and Pauline Jacobson. 1988. Agreement as a semantic phenomenon. In Proceedings of the Fifth Eastern States Conference on Linguistics, ed. J. Powers and K. de Jong. Columbus: Ohio State University. 95–108. Gazdar, Gerald. 1981. Unbounded dependencies and coordinate structure. Linguistic Inquiry 12: 155 – 84. Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum, and Ivan Sag. 1985. Generalized Phrase Structure Grammar. Cambridge, MA: Harvard University Press. Goldsmith, Jon. 1985. A principled exception to the coordinate structure constraint. In Proceedings of the Chicago Linguistic Society 21. Chicago: Chicago Linguistics Society. 133 –143. Green, Georgia. 1995. The structure of . In Proceedings of the Fifth Annual Conference of the Formal Linguistics Society of Mid-America. Studies in the Linguistics Sciences 24, ed. James Yoon. Department of Linguistics, University of Illinois. 215 –232. Grinder, John. 1970. Super Equi-NP-Deletion. In Papers from the Sixth Regional Meeting of the Chicago Linguistics Society. Chicago: Chicago Linguistics Society. 297–317. Grinder, John. 1971. A reply to Super Equi-NP-Deletion as Dative Deletion. In Papers from the Seventh Regional Meeting of the Chicago Linguistics Society. Chicago: Chicago Linguistics Society. 101–111. Grosu, Alexander. 1972. The Strategic Content of Island Constraints. Ohio State Working Papers in Linguistics. Columbus: Ohio State University.
Haegeman, Liliane. 1984. Parasitic gaps and adverbial clauses. Journal of Linguistics 20: 229 –232. Hegarty, Michael. 1991. Adjunct extraction without traces. In Proceedings of the Tenth West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information. 209 –222. Hukari, Thomas E., and Robert D. Levine. 1987. Rethinking connectivity in unbounded dependency constructions. In Proceedings of the Sixth Annual West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information. 91–102. Hukari, Thomas E., and Robert D. Levine. 1991. On the disunity of unbounded dependency constructions. Natural Language and Linguistic Theory 9: 97–144. Hukari, Thomas E., and Robert D. Levine. 1994. Adjunct Extraction. In Proceedings of the Twelfth Annual West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information. 196–226. Hukari, Thomas E., and Robert D. Levine. 1995a. Adjunct Extraction. Journal of Linguistics 31: 195–226. Hukari, Thomas E., and Robert D. Levine. 1995b. On Termination and ValenceBased Binding Theory. Paper presented at the 1995 Linguistic Society of America meetings, 5– 8 January. Hukari, Thomas E., and Robert D. Levine. 1996. Phrase structure grammar: the next generation. Journal of Linguistics 32: 465– 496. Hukari, Thomas E., and Robert D. Levine. 1998. Subject extraction. University of Victoria/Ohio State University MS; originally presented at the 1996 International Conference on HPSG, Marseilles. Jacobson, Pauline, and Paul Neubauer. 1976. Strict cyclicity: evidence from the Intervention Constraint. Linguistic Inquiry 7: 429– 462. Kasper, Robert. 1994. Adjuncts in the Mittelfeld. In German in Head-Driven Phrase Structure Grammar, ed. Klause Netter, John Nerbonne, and Carl Pollard. Stanford: Center for the Study of Language and Information/University of Chicago Press. 39 – 69. Kathol, Andreas. 1995. Linearization-Based German Syntax. Ph.D. dissertation, Ohio State University; printed by Faculteit der Lettern, Rijksuniversiteit Groningen. Kimball, John. 1971. Super Equi-NP-Deletion as Dative Deletion. In Papers from the Seventh Regional Meeting of the Chicago Linguistics Society. Chicago: Chicago Linguistics Society. 142–148. Lakoff, George. 1986. Frame semantic control of the coordinate structure constraint. In Papers from the Parasession on Pragmatics and Grammatical Theory at the 22nd Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistics Society. 152–167. Levine, Robert D., Thomas E. Hukari, and Mike Calcagno. To appear. Some overlooked parasitic gap constructions in English and their theoretical implications. In Parasitic Gaps, ed. P. Culicover and P. Postal. Cambridge: MIT Press. Liberman, Mark. 1973. Some Observations on Semantic Shape. Master’s thesis Massachussetts Institute of Technology. Moore, John. 1996. Reduced Constructions in Spanish. New York: Garland. O’Grady, William. 1987. Principles of Grammar and Learning. Chicago: University of Chicago Press.
Pollard, Carl J., and Drew Moshier. 1990. Unifying partial descriptions of sets. In Vancouver Studies in Cognitive Science, vol. 1: Information, Language and Cognition. Vancouver: University of British Columbia Press. 285–322. Pollard, Carl J., and Ivan Sag. 1987. Information-Based Syntax and Semantics. Stanford: Center for the Study of Language and Information. Pollard, Carl J., and Ivan Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information. Postal, Paul M. 1994. Contrasting extraction types. Journal of Linguistics 30: 159 –186. Postal, Paul M., and Geoffrey K. Pullum. 1988. Expletive noun phrases in subcategorized positions. Linguistic Inquiry 19: 635– 670. Pullum, Geoffrey K. 1982. Free word order and phrase structure rules. Proceedings of the Northeastern Linguistics Society 12: 209 –220. Reape, Michael. 1993. A Formal Theory of Word Order: a Case Study in West Germanic. Unpublished Ph.D. dissertation, University of Edinburgh. Reape, Michael. 1996. Getting things in order. In Discontinuous Constituency, ed. Harry Bunt and Arthur van Horck. Berlin: Mouton de Gruyter. 209–253. Ross, John R. 1967. Constraints on Variables in Syntax. Ph.D. dissertation, Massachussetts Institute of Technology. Published as Infinite Syntax ! Norwood, NJ: Ablex. 1986. Sag, Ivan. 1997. English relative clause constructions. Journal of Linguistics 33: 431– 484. Shieber, Stuart. 1986. An Introduction to Unification-Based Approaches to Grammar. Stanford: Center for the Study of Language and Information. Wechsler, Stephen. 1995. The Semantic Basis of Argument Structure. Stanford: Center for the Study of Language and Information.
1 The lexical integrity of Japanese causatives
Christopher D. Manning, Stanford University
Ivan A. Sag, Stanford University
Masayo Iida, Inxight Software Inc.
1 Introduction
Grammatical theory has long wrestled with the fact that causative constructions exhibit properties of both single words and complex phrases. However, as Paul Kiparsky has observed, the distribution of such properties of causatives is not arbitrary: “construal” phenomena such as honorification, anaphor and pronominal binding, and quantifier “floating” typically behave as they would if causatives were syntactically complex, embedding constructions; whereas case marking, agreement, and word order phenomena all point to the analysis of causatives as single lexical items.1 Although an analysis of causatives in terms of complex syntactic structures has frequently been adopted in an attempt to simplify the mapping to semantic structure, we believe that motivating syntactic structure based on perceived semantics is questionable because in general a syntax/semantics homomorphism cannot be maintained without vitiating syntactic theory (Miller 1991). Instead, we sketch a strictly lexical theory of Japanese causatives that deals with the evidence offered for a complex phrasal analysis. Such an analysis makes the phonology, morphology, and syntax parallel, while a mismatch occurs with the semantics. The conclusions we will reach are given in (1): 1
This paper has had a long gestation. Initial arguments for a lexicalist treatment of Japanese causatives were gathered in a seminar class run by Ivan Sag in 1990. Participants included Makoto Kanazawa, Patrick O’Neill, and Whitney Tabor. The details of the analysis were changed and a new paper written by the listed authors and O’Neill for presentation at the 1994 LSA Annual Meeting in Boston. The present version, which includes new data and extensive analytic revisions, was prepared by Manning and Sag, in regular consultation with Iida. We thank earlier contributors, and in addition are grateful to the following for comments and discussion: Emily Bender, Gosse Bouma, Ann Copestake, Kaz Fukushima, Takao Gunji, Rob Malouf, Tsuneko Nakazawa, Jerry Sadock, and Peter Sells. We’re not quite sure who should be held responsible for any remaining errors.
(1)
a. Japanese causatives must be treated as single verbal forms with complex morphological structure. The causative morpheme should not be treated as a higher predicate as it is in most transformational/GB analyses (following Kuroda 1965), and in Gunji (this volume).
b. The construal phenomena that seem to motivate an analysis of Japanese causatives in terms of embedded constituent structures can be explained in terms of hierarchical lexical argument structures.
c. It is possible to maintain a strictly lexical analysis, once a suitable conception of lexical structure and organization is adopted.

Our analysis, which provides a simple alternative to current proposals making extensive use of verb-embedding, functional projections and empty categories, is cast within the framework of head-driven phrase structure grammar (HPSG), but is easily adapted to other lexical frameworks, such as LFG and categorial grammar, and is similar in some respects to lexical GB accounts like those offered by Miyagawa (1980) and Kitagawa (1986).
2 The data

Japanese causative verbs are formed by adding -(s)ase to a verb stem, as in (2). The causer is marked with the nominative case particle ga, and the causee is marked with the dative particle ni (or optionally the accusative particle o if the stem was intransitive).

(2) Yumiko ga Ziroo ni sono hon o yom-ase-ta.
Yumiko NOM Ziroo DAT that book ACC read-CAUS-PAST
“Yumiko made/let Ziroo read that book.”

2.1 Phonological and lexical arguments

The intuition of the native Japanese speaker regarding the “wordhood” of a causative verb such as tazune-sase-ru “visit-CAUS-PRES” is clear – these verbs are single words. This intuition is supported by a number of phonological observations that have been made by Kitagawa (1986), McCawley (1968), Poser (1984), and others. We present here arguments from allomorphy and reduplication, and suggestive evidence from accentuation (for similar suggestive evidence from voicing spread and downdrift see Kitagawa 1986).

2.1.1 Allomorphy

The consonant deletion that converts -sase to -ase after consonant stems:

(3) a. tabe-sase-ru
eat-CAUS-PRES
b. kak-ase-ru
write-CAUS-PRES

is idiosyncratic rather than a general phonological rule (the general phonological rules would rather yield epenthesis, i.e. kakisaseru). This argues that -sase is lexically attached.
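Descriptively, the allomorph choice itself is simple to state once it is conditioned on the shape of the stem. The following Python fragment is a deliberately simplified sketch over romanized stems; the vowel/consonant test and the function name are our own illustrative assumptions, and the point made in the text is precisely that this choice is an idiosyncratic, lexically attached fact rather than an effect of the general phonology.

VOWELS = set("aeiou")

def causative(stem):
    """Attach -sase after vowel-final stems, -ase after consonant-final stems."""
    suffix = "sase" if stem[-1] in VOWELS else "ase"
    return f"{stem}-{suffix}"

print(causative("tabe"))   # tabe-sase  'cause to eat'
print(causative("kak"))    # kak-ase    'cause to write'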
2.1.2 Reduplication

Repetition of a certain action can be expressed by reduplicating the verb (4a). Such reduplication with causatives cannot exclude the verb stem (4c):

(4) a. gohan o tabe tabe
rice ACC eat eat
“eating rice repeatedly”
b. ?gohan o tabe-sase tabe-sase
rice ACC eat-CAUS eat-CAUS
“causing someone to eat rice repeatedly”
c. *gohan o tabe-sase sase
rice ACC eat-CAUS CAUS
This argues that tabe-sase must be formed in the lexicon, since reduplication is a lexical process (Marantz 1982).2

2.1.3 Accentuation

Kitagawa (1986) presents a number of further arguments based on the theory of Lexical Phonology (Kiparsky 1982) that show that the past tense morpheme -ta and the desiderative morpheme -ta(i) attach to their host in the lexicon. This host can be either a verb root or the causative morpheme (among other things). For instance, observe the following pattern of accentuation (McCawley 1968, Chew 1961):

(5) a. tabé-ru (eat-PRES)
b. tábe-ta (eat-PAST)
c. tabe-sasé-ru (eat-CAUS-PRES)
d. tabe-sáse-ta (eat-CAUS-PAST)
Miyagawa (1989) and Kitagawa (1986) argue that under the theory of Lexical Phonology, these accentual alternations show that the past tense morpheme attaches lexically. On the assumption that the causative morpheme attaches to a verb stem before the final tense morpheme, this evidence would show that the causative morpheme also attaches lexically. However, we do not view such arguments as deciding the structure of causatives. One could accept the lexical attachment of the tense and desiderative morphemes and still deny the additional assumption mentioned above. We do not know of further convincing phonological evidence for the lexical analysis of Japanese causatives beyond that presented in sections 2.1.1 and 2.1.2.3
2 The awkwardness of (4b) is presumably due to pragmatic factors.
3 Other putative arguments, like noting that the accent on -másu overrides a stem affix across a causative affix, also fail because the same accentual phenomena occur with verbal compounds like yonde miru. Poser (class, Stanford, 1993) suggests as a further argument that normally any word can be an intonational minor phrase (with focus intonation) in the sense of McCawley (1968) but that -sase cannot be one. We thank Bill Poser for discussion of the phonological data.
2.1.4 Lexicalization, idioms, and blocking

Miyagawa (1980, 1989) presents a variety of arguments from idioms, blocking, and idiosyncratic causatives (that have undergone semantic drift or which have survived while the base verb has disappeared) to argue for a lexical analysis of Japanese causatives. We take many of these arguments as suggestive, but not fully convincing, because there are clear cases in the literature where blocking and semantic drift occur in the syntax (e.g. Poser 1992).
2.2 Morphosyntactic arguments

A large number of morphosyntactic arguments favor the lexical analysis.

2.2.1 Subject honorification

When the person denoted by the subject NP is socially superior to the speaker, the verb that governs that subject conventionally bears subject honorification morphology, o- and ni nar-, as illustrated in (6a), which involves the syntactically complex -te yaru construction.4 Only the main verb can bear subject honorification morphology in such constructions, as shown by the ungrammaticality of (6b). (6)
a. Tanaka-sensei ga kodomo ni hon o yonde o-yari ni Prof. Tanaka child book read- -give nat-ta. become- “Prof. Tanaka gave the child the favor of reading a book.” b. *Tanaka-sensei ga kodomo ni hon o o-yonde yari ni Prof. Tanaka child book -read- give nat-ta. become- “Prof. Tanaka gave the child the favor of reading the book.”
In contrast, a causative verb as a whole can bear subject honorification morphology, whereas the causative morpheme -(s)ase alone cannot bear that morphology, as shown in (7): (7)
4
a. Tanaka-sensei ga Suzuki ni hon o o-yom-ase ni Prof. Tanaka Suzuki book -read- nat-ta. become- “Prof. Tanaka made Suzuki read a book.”
We don’t gloss the word ni which appears in the subject honorific construction because we are not sure what it is. Accentuation suggests that yari is a deverbal noun, though it could conceivably be the segmentally identical verbal renyookei. It is reasonably certain, though, that the morpheme o- before yari is a prefix attached to the word yari.
b. *Tanaka-sensei ga Suzuki ni hon o yomi o-sase ni Prof. Tanaka Suzuki book read - nat-ta. become- “Prof. Tanaka made Suzuki read a book.” This observation argues for a lexical analysis of the causative (Sugioka 1984: 51). If the construction were syntactically complex, the honorific prefix should precede only the causative morpheme, in parallel to (6a). Put differently, in an analysis where causatives involve embedded complement clauses, it is quite mysterious how the honorific prefix o- gets to attach to the verb in the lower clause. Note finally that the other possibility, where honorification occurs inside causativization in the morphology, as in (8), provides no problems for a lexical account. For such a form, honorification occurs to the stem, and then this larger stem is causativized. The resulting pattern whereby the causee is honored falls out of the account we present below, and would be expected to fall out of almost any lexical account.5 (8)
Syukutyoku no yoomuin ga kootyoo-sensei ni yoomuin-situ night.duty janitor principal night.duty.room de sibaraku o-yasumi ni nar-ase-te sasiage-ta (koto) in a.little -rest become-- give- (fact) “The janitor on night duty let the principal take a rest in the night duty room for a little while.”
2.2.2 The double-o constraint Example (9) shows that the causative construction observes the double-o constraint (a prohibition on multiple direct objects, marked by the particle o: see Harada 1973, Poser 1989). When the embedded verb is transitive, the causee cannot be marked with accusative because this would yield two omarked NPs. (9)
Taroo ga Ziroo *o/ni Kazuo o home-sase-ta. Taro Ziro / Kazuo praise-- “Taro made Ziro praise Kazuo.”
The case marking in the morphological causative thus parallels that of the lexical causative (10a) and that of simplex three-argument verbs (10b): (10) a. Taroo ga Ziroo *o/ni e o mise-ta. Taro Ziro / picture show- “Taroo showed a picture to Ziroo.” 5
Such forms are often pragmatically awkward, however, doubtless due to the incongruity of simultaneously honoring someone and making them the causee.
b. Taroo ga Ziroo *o/ni e o age-ta. Taro Ziro / picture give- “Taroo gave a picture to Ziroo.” Only the lexical analysis predicts the case marking of causatives from the general case marking requirements for three-argument verbs without a further stipulation. 2.2.3 Nominalization Nominalizations also support the lexical approach. Suffixation of -kata creates a nominal meaning “way of,” and can apply to causatives (Saiki 1987), as illustrated in (11b). (11) a. kodomo ni hon o yom-ase-ta. child book read-- “(I) caused the child to read a book.” b. (?kodomo e no) hon no yom-ase-kata child book read--way “the way to cause (the child) to read a book” The genitive case marking on the object hon shows that yomasekata is a noun. Under a nonlexical analysis of causatives we would expect to nominalize only -(s)ase and to get accusative case o after hon. Moreover, it would be difficult to account for the accent-deleting properties of -kata, within a theory such as Lexical Phonology, unless yom-ase-kata is analyzed as a single word.6 2.2.4 Question–answer pairs A question with biclausal structure in Japanese is generally answered by repetition of the higher verb: (12) a. John ga iku yoo ni si-ta ka? John go- () do- Q “Have (you) arranged for John to go?” b. Si-ta (yo). do- “Yes, I have.” lit. “Did.” (13) a. John ni [it-te kure-ru yoo ni] tanon-da ka? John [go- give- ()] ask- Q “Have (you) asked John to go?” 6
We thank Peter Sells and Bill Poser for most of the ideas that underlie this section. It should be mentioned, though, that this argument only shows that the noun yomasekata is a word, and not necessarily that the corresponding verbal forms are, as was pointed out to us by a reviewer.
b. Tanon-da (yo). ask- “Yes, I have.” lit. “Asked.” But one cannot answer a question formed with a causative construction by just a causative morpheme. Rather one must repeat the whole causative form (i.e. including the putative embedded verb): (14) a. John o ik-ase-ta ka? John go-- Q “Have you caused John to go?” b. *Sase-ta. - This behavior requires a special stipulation on the nonlexical account. It is predicted if the causativized verb is treated as a lexical item. 2.2.5 Word order When a causative verb takes a theme argument and a location argument, the unmarked order is location–theme, not theme–location. For instance, in a pair like: (15) a. no ni hana o sak-ase-ru field in flower bloom-- “to cause flowers to bloom in fields” b. hana o no ni sak-ase-ru flower field in bloom-- “to cause flowers to bloom in fields” the first sentence, which has the location–theme order, is unmarked. The second sentence is somewhat less natural, and seems to be acceptable only when the location argument gets focus interpretation. This observation is unexpected under the nonlexical analysis, because it predicts that the causee argument (here, the theme) should precede all the embedded arguments in the unmarked word order, assuming that the order produced by clausal embedding is the unmarked order. In contrast, this unmarked ordering is predicted under a lexical account where it reflects the normal rules for ordering clausal constituents (Kuno 1973: 351). 2.2.6 Potential Japanese has a morpheme, -(rar)e, which adds a notion of ability or possibility to the meaning of a verb. When this morpheme is introduced into a clause, an argument which was marked in the accusative may optionally be marked with the nominative: (16) a. Mitiko wa hon o yon-da Mitiko book read- “Mitiko read the book.”
b. Mitiko wa hon ga/o yom-e-ru Mitiko book / read-- “Mitiko can read the book.” The generalization applies even to potentialized causatives (although the resulting sentences are somewhat less natural): (17) ?Taroo ga kodomo ni piano ga naraw-ase-rare-nakat-ta (koto) Taroo child piano learn---- (fact) “(the fact that) Taroo was not able to make the child learn how to play the piano” This fact would lack any natural explanation on a nonlexical analysis which treats the third NP in (17) as belonging to an embedded clause. But it follows naturally on the lexical analysis: the third NP is treated as an argument of the potentialized verb, so its case marking is predicted by the same generalization that specifies the case marking for potentialized simplex verbs, such as in (16b). 2.2.7 Negative polarity items and reciprocals It is generally accepted that the negative polarity item sika “except” can only be licensed by a negative in its own clause (Muraki 1978, Kitagawa 1986: 136).7 For instance, the following is impossible: (18) *Watasi wa [kare ga biiru sika nom-u] to sir-ana-katta. I he beer except drink- know-- “I didn’t know that he drinks anything but beer.” But note now that sika is licensed on an argument of the verb stem in a causative, even though the sentential negation occurs after (s)ase-: (19) ano ban watasi wa Taroo ni biiru sika nom-ase-na-katta. that night I Taroo beer except drink--- “That night, I made/let Taroo drink only beer” lit. “. . . not drink except beer.” This argues that a causative sentence is a single clause. Similar arguments can be made with respect to the reciprocal morpheme -a(w): see Kitagawa (1986: 174), although, as noted there, there is considerable variation in the acceptance of reciprocalized causatives.
2.3
Syntactic puzzles for a lexical analysis
Now let us turn to syntactic arguments, which are often taken to favor a nonlexical analysis. We will show that all relevant data can in fact be satisfactorily 7
Sika cooccurs with a negative verb as an NPI. It is generally translated as “only” in English in a positive sentence.
explained within the lexical analysis we develop. We begin with what we take to be two nonarguments, and then consider in turn data from adverb scope, apparent coordination, binding, and quantifier scope. 2.3.1 Nonarguments from anaphora and intervening particles Shibatani (1973) argues for a nonlexical analysis on the grounds that the putative pro-VP soo s- “do so” may refer to either a whole causation event or the lower predicate. However, many people have expressed skepticism as to whether soo s- is a pro-VP (Hinds 1973, Miyagawa 1980). It is not the case that soo s- always takes a VP antecedent, since the antecedent can be an event expressed by two conjoined sentences in a previous discourse: (20) A: Taroo wa Yamada-sensei ni ai ni it-ta. Taroo Yamada-teacher meet go- “Taroo went to see Prof. Yamada.” Suisenzyoo o kaite morau yoo tanon-da. recommendation write receive ask- “He asked for a letter of recommendation to be written for him.” B: Hanako mo soo si-ta. Hanako also so do- “Hanako did so, too.” This suggests that the antecedent of soo suru might better be described in terms of the cognitive structure of events than via syntactic notions of constituency. Kuroda (1981) argues for a syntactic analysis of causatives on the basis of the ability of the negative morpheme na- and certain particles such as mo “also” and sae “even” to intervene between a verb stem and what he takes to be a bare causative morpheme. However, any such argument is greatly weakened by the homonymy between the causative sase- and the form that results from adding -(s)ase to the verb stem s- “do”: s- + -(s)ase → s-ase. See Miyagawa (1989) and particularly Kitagawa (1986: 184) for evidence establishing that the allegedly problematic examples are actually manifestations of the causative of s- “do”. 2.3.2 Adverb scope Next, we consider adverb scope. Adverbs in the causative construction can in general be interpreted as modifying either the event denoted by the verb stem or the causation event (Shibatani 1990: 314). For instance, (21) is ambiguous. (21) Noriko ga Masaru ni gakkoo de hasir-ase-ta. Noriko Masaru school at run-- “Noriko made Masaru run at school” What happened at school may be either the causing event performed by Noriko or the running event caused by Noriko and performed by Masaru.
If adverb scope could be captured only by providing phrase structural domains for an adverb to take scope over, then this would be an argument for a syntactic analysis. Different interpretations could be obtained by assuring different positions for the adverb as illustrated in (22). (22) a. [Noriko ga Masaru ni [gakkoo de [[hasir]-ase]]] b. [Noriko ga Masaru ni [[gakkoo de [hasir]]-ase]] On this view, the ambiguity of adverb scope is attributed to the presence of an embedding structure, i.e. the presence of two sentential domains over which adverbs can take scope. Some authors have suggested that, as a result, certain adverb positions have unambiguous scope readings, as shown in (23). (23) a. Taroo ga damatte Hanako o heya ni hair-ase-ta. Taroo silently Hanako room into enter-- “Taroo made Hanako enter the room silently.” [unambiguous] (Miyagawa 1980) b. Damatte Taroo ga Hanako o heya ni hair-ase-ta. Silently Taroo Hanako room into enter-- “Taroo made Hanako enter the room silently.” [unambiguous] (Miyagawa 1980) While a full account of different scope preferences for adverbs is beyond the scope of this paper, we note that various proposed structural restrictions on scope have been contested (e.g. by Kitagawa 1986: 89), and in particular there exist sentences such as those in (24) in which the adverb appears in structurally the same position as in (23a), but where it can clearly modify either the causation event or the caused event. We will take it as our goal to allow both scopal possibilities for all adverb positions within the clause. (24) a. Ken ga hitori de Naomi ni hon o yom-ase-ta. Ken by oneself Naomi book read-- “Ken made Naomi read the book by herself.” “Ken made Naomi read the book all by himself.” b. Ken ga damatte Naomi o suwar-ase-ta. Ken silently Naomi sit-- “Ken (silently) made Naomi sit (silently).” c. Ken ga zibun no pen de Naomi ni sakubun o Ken self pen with Naomi composition kak-ase-ta. write-- “Ken (with his own pen) made Naomi write a composition (with her own pen).”
2.3.3 Coordination It is sometimes assumed that examples like (25) involve coordinate structures, even though there is no overt coordinating particle. (25) Ken wa Naomi ni [[hurui kutu o sute]-te [atarasii kutu o Ken Naomi old shoes throw- new shoes kaw]] -ase-ta. buy - “Ken made Naomi throw away her old shoes and buy new ones.” Given this assumption, the intended reading suggests, as noted by Gunji (1987), that the VPs hurui kutu o sute and atarasii kutu o kaw are conjoined and -sase is attached to this complex VP. These sentences, however, cannot provide strong evidence for any nonlexical analysis because the phrases containing a gerundive verb (sutete) should be considered as adverbial phrases, rather than as conjoined VPs.8 Sentence (26) shows that the phrase “throw away old shoes” is indeed acting as an AdvP because, as an adjunct, it can be placed inside the middle of the other supposed conjunct.9 (26) Ken wa Naomi ni atarasii kutu o [hurui kutu o sute-te] Ken Naomi new shoes old shoes throw kaw-ase-ta. buy-- “Ken made Naomi throw away old shoes and buy new shoes.” Asymmetries in the desiderative ga/o alternation with these putative “coordinated VPs,” as in (27), provide further support for our claim (Sugioka 1984: 168). (27) a. *Boku wa [kootya ga non-de], [keeki ga tabe]-tai. I tea drink- cake eat- “I want to drink tea and eat cake.” b. *Boku wa [kootya ga non-de], [keeki o tabe]-tai. I tea drink- cake eat- c. ?Boku wa [kootya o non-de], [keeki ga tabe]-tai. I tea drink- cake eat- d. Boku wa [kootya o non-de], [keeki o tabe]-tai. I tea drink- cake eat- These asymmetries can be explained by assuming that the first apparent VP is actually an AdvP, and that therefore the case marking of the first object (kootya “tea”) cannot be affected by properties of the suffix -tai. 8 9
We thank Michio Isoda for some of the ideas behind this section. Some speakers appear to rate this sentence as deserving a “?” in front, while others regard it as fine. At any rate, this situation contrasts clearly with real conjunction.
Third, note the behavior of relativization: (28) a. [Ken ga Naomi ni [hurui kutu o sute-te] kaw-ase-ta] Ken Naomi old shoes throw buy-- atarasii kutu new shoes “the new shoes which Ken made Naomi throw away old shoes and buy” b. *[Ken ga Naomi ni [sute-te] atarasii kutu o kaw-ase-ta] Ken Naomi throw new shoes buy-- hurui kutu old shoes “the old shoes which Ken made Naomi throw away and buy new ones” The linearly second object (“new shoes”) can be relativized as in (28a), while the first object (“old shoes”) cannot (28b). If (28a) were actually a case of coordination, then it should be bad as a violation of the Coordinate Structure Constraint. We hasten to add that the same asymmetries are found with renyookei “coordination” as well. The desiderative alternation is illustrated in Sugioka (1984: 168), and the same relativization facts hold as above. Our consultants judge scrambling with renyookei “coordination” less acceptable than with -te form “coordination,” but not impossible. We have no explanation for this at present. 2.3.4 Binding Binding facts are used as further syntactic evidence to support a nonlexical analysis (Kuroda 1965). It has been widely accepted in the literature that zibun (“self ”) is a subject-oriented reflexive. The fact that causee arguments can antecede reflexives as shown in (29) appears to support the embeddingstructure analysis of causatives: zibun-binding to the cause Taroo is possible because Taroo is the embedded complement subject. (29) Hanako ga Taroo ni zibun no syasin o mi-sase-ta. Hanako Taroo self picture see-- “Hanakoi made Tarooj see heri/hisj picture.” In contrast, the standard judgment is that there is no ambiguity in (30) where the lexical causative form miseru (“show”) is used.10 10
This conclusion is questioned in some work such as Momoi (1985) and Iida (1992), but we will accept it here.
(30) Hankao ga Taroo ni zibun no syasin o mise-ta. Hanako aro self picture show- “Hanako showed Taroo her/*his picture.” However, as Iida (1992, 1996) has shown, there are good reasons to question the subject-based account of zibun-binding. There are many clear counterexamples such as those in (31): (31) a. Zibun no buka no husimatu ga Taroo no self subordinate misconduct Taroo syusse o samatage-ta. promotion mar- “The misconduct of hisi subordinate marred Tarooi’s promotion.” b. Taroo wa Zirooi ni zibuni no ayamati o satosi-ta. Taroo Ziroo self mistake make-realize- “Taroo made Zirooi realize hisi mistake.” But even assuming the subject-based generalization is basically right, it is possible to account for the zibun-binding facts without assuming an embedded constituent structure. Within HPSG, binding theory is universally based on argument structure, and hence the subject-orientation of zibun-binding need not be stated in terms of constituent structure at all. We return to this matter in section 4.2.1. Both the overt pronoun kare (“he”) and the zero pronoun (“little pro”) are regarded as pronominal elements and subject to Principle B, as shown in (32): (32) *Tarooi wa Hanako ni karei o/øi sarakedasi-ta. Taroo Hanako he /pro reveal- “Tarooi revealed himi to Hanako.” However, in the morphological causative construction, as shown in (33), kare and the zero pronoun in the lower object position may be bound by the subject, but must be disjoint in reference with the dative causee (Kitagawa 1986, Shibatani 1990). (33) a. Tarooi wa Zirooj ni karei/*j o bengo s-ase-ta. Taroo Ziroo he defense do-- “Tarooi made Zirooj defend himi/*j .” b. Tarooi wa Zirooj ni øi/*j bengo s-ase-ta. Taroo Ziroo defense do-- “Tarooi made Zirooj defend himi/*j .” These facts have also been used as evidence to support the embedded analysis of the morphological causative.
Although kare exhibits various peculiarities that challenge its traditional classification as a simple pronominal,11 we will nonetheless assume here that it falls within the scope of Principle B, and seek to explain this behavior, too, in terms of an argument-structure-based theory of binding. 2.3.5 Quantifier scope Finally, we consider a problem about quantifier scope similar to that posed by the interaction of adverbs and causatives. A quantified NP functioning as the lower object of a causative verb form can take intermediate scope, i.e. can take scope over the verb stem, but be outscoped by the causative operator, as illustrated in (34). (34) Tanaka-sensei ga gakusei ni sansatu hon o sirabe-sase-ta. Prof. Tanaka student three book check-- “Prof. Tanaka made the student check three books.” Perhaps clearer examples of ambiguous scopal interpretation involving the quantifier particle sika “except” (recall section 2.2.7) are discussed by Kitagawa (1986: 138). Sentence (35a) can mean either (i) only with respect to beer, I brought about a situation such that Taroo drank it (not the whiskey, etc.) or (ii) I brought about a situation such that Taroo would drink only beer (and no whiskey, etc.), and a similar ambiguity exists in the interpretation of (35b). (35) a. ano ban watasi wa Taroo ni biiru sika nom-ase-na-katta. that night I Taroo beer except drink--- “That night, I made/let Taroo drink only beer” lit. “. . . not drink except beer.” b. Rupan wa tesita ni hooseki sika nusum-ase-na-katta. Lupin follower jewelry except steal--- “Lupin made/let his followers steal only the jewelry.” In light of these observations, it is essential that any lexical account of causatives make clear how it can deal with such ambiguous scope assignments. Under the assumption that the causative is a single lexical entity, the problem posed by such examples is basically the problem of how to assign 11
For example, kare does not serve as a bound variable: kare does not refer to the quantified subject NP in (i) and (ii). (i) ?*dono otokoi mo karei no tomodati o hihan si-ta. which man also he friend criticism do- “Every mani criticized hisi friend.” (ii) *dono otokoi mo [Masaru ga karei o hometa]koto ni odoroi-ta. which man also Masaru he praised be surprised- “Every mani was surprised at the fact that Masaru praised himi.” Furthermore, as Takubo (1990) observes, kare can only refer to a person whose identity has been established in the speaker’s knowledge.
“word-internal” scope to a quantified NP that appears external to the lexical causative. The account must predict that a quantified argument of the causative verb can be interpreted as having narrow scope with respect to the causative operator, even though there is no syntactic constituent to serve as the basis of that particular scope assignment.
3
Background and basics of the analysis
3.1
Essentials of HPSG
Our general proposal for a lexical treatment of -sase causatives is compatible with a variety of lexicalist frameworks. The crucial ingredient we need is a theory of word formation that allows constraints to apply to the argument structures of both the causative verb as a whole and also the stem to which the causative suffix is added.12 The conception of argument structure that we employ is based on essentially the same notion of lists as that used by Pollard and Sag (1987) and Gunji (1987). However, following recent work in HPSG,13 we distinguish argument structure (-) from a word’s valence, which is specified in terms of the features (), (), and (). Canonically, the values of a word’s valence features “add up” (via list concatenation [or the “append” relation]) to the verb’s - value, as illustrated for the English words in (36).14 (36) a. buys -
HEAD verb[fin]    SUBJ 〈[1]NP[nom]3sg〉    COMPS 〈[2]NP〉    ARG-ST 〈[1], [2]〉
b. picture -
HEAD noun    SPR 〈[1]Det〉    COMPS 〈([2]PP[of])〉    ARG-ST 〈[1], [2]〉
In this theory, it is the valence features (not -) whose values are “cancelled off ” (in a categorial grammar-like manner) as a head projects a phrase. A lexical head combines with its complements and subject or specifier (if any) according to the lexically inherited specification, as shown in (37). 12
13 14
The notion of argument structure draws from related work in many frameworks, for instance Kiparsky (1987), Rappaport and Levin (1988), Bresnan and Zaenen (1990), Grimshaw (1990), Alsina (1993), and Butt (1993). Our conception of argument structure is developed more fully in Manning and Sag (1998). Let us merely note that in this work argument structure has the following three properties: (1) it is a syntactic construct that is crucially distinct from semantic structure (Manning 1994), but systematically related to it (Davis 1996); (2) it is associated only with lexical signs, not phrases ; and (3) it is the locus of binding theory. Borsley (1989), Pollard and Sag (1994: ch. 9), Miller and Sag (1997), Abeillé and Godard (1994). Here and throughout, we are ignoring the details of the feature geometry of HPSG signs, displaying only those features that are of direct relevance. We return below to the issue of argument conservation, i.e. the relation between the values of valence features and argument structure.
(37)
S[3]    SUBJ 〈 〉    COMPS 〈 〉
    [2]NP[nom]    SUBJ 〈 〉    COMPS 〈 〉        Sandy
    VP[3]    SUBJ 〈[2]〉    COMPS 〈 〉
        V[3]    SUBJ 〈[2]〉    COMPS 〈[4]〉    ARG-ST 〈[2], [4]〉        buys
        [4]NP[acc]    SUBJ 〈 〉    COMPS 〈 〉        the picture
Unlike English, we assume for Japanese that subjects and complements can be cancelled in any order and in any quantity, predicting clause-bounded scrambling.15 The - list remains unaffected in the construction of syntactic phrases, except that, in virtue of the various identities between - members and members of valence lists, the - list’s members become fully specified as the valence list values are identified with actual subjects, complements, and specifiers. Once a complete phrase is constructed, the lexical head’s - list is a fully specified hierarchical argument structure. As we will see below, it is the - list that is the locus of binding theory.
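To make the bookkeeping in (36)–(37) concrete, the following is a minimal illustrative sketch in Python. It is not part of the formalism: real HPSG signs are typed feature structures, and the dictionary layout and helper names here (make_word, combine) are our own simplifications. The point is only that ARG-ST arises as the append of the valence lists, and that valence requirements are cancelled as the head combines with its arguments while ARG-ST is left intact.

```python
# Toy sketch only: SUBJ/COMPS valence lists versus ARG-ST, cf. (36)-(37).

def make_word(phon, head, subj, comps):
    """Canonically, ARG-ST is the concatenation ("append") of the valence lists."""
    return {"PHON": phon, "HEAD": head,
            "SUBJ": list(subj), "COMPS": list(comps),
            "ARG-ST": list(subj) + list(comps)}

def combine(head_sign, argument, valence="COMPS"):
    """Cancel one valence requirement, categorial-grammar style; the ARG-ST
    list is untouched (it is merely further instantiated by the actual
    subject or complement)."""
    assert head_sign[valence], "no requirement left to cancel on " + valence
    mother = dict(head_sign)
    mother[valence] = head_sign[valence][1:]
    mother["DTRS"] = head_sign.get("DTRS", []) + [argument]
    return mother

buys = make_word("buys", "verb[fin]", subj=["NP[nom]3sg"], comps=["NP[acc]"])
vp = combine(buys, "the picture", valence="COMPS")   # VP: COMPS saturated
s = combine(vp, "Sandy", valence="SUBJ")             # S: both lists empty
print(s["SUBJ"], s["COMPS"], s["ARG-ST"])
# [] [] ['NP[nom]3sg', 'NP[acc]']
```

For Japanese, the same sketch would simply have to allow requirements to be cancelled in any order and quantity, as assumed in the text.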
3.2
Lexical organization and morphology
Basic lexical entries, which we may think of as morphological stems, give rise to further forms through the application of morpholexical processes of various kinds. A number of techniques have been developed for the description of complex morphological forms within lexicalist frameworks, including the lexical rules approach sketched in Pollard and Sag (1987) and Flickinger (1987), a type-based treatment of lexical rules developed by Copestake (1992), and the “type-based” approach to morphology developed by Riehemann (1993, 1995). Our basic analysis of Japanese causatives is compatible with any of 15
Alternatively, following Kathol (1995), Japanese subjects and complements belong to a single ordering domain, which sanctions essentially the same word order freedom in virtue of the paucity of Japanese linear precedence constraints.
these approaches, but we will here develop our account in terms of a theory of derivational types, which specify a declarative relationship between a stem and a stem (which is morphologically “derived” from it). Such an approach is closely related to what Copestake proposes (see also Meurers 1995). It has the advantages of allowing inheritance within the hierarchical lexicon of HPSG to extend over both stem and word types and derivational types (as in Riehemann’s approach) while preserving the locality of information and lexical integrity of words within the syntax that is wellcaptured within the lexical rules approach. The first point means that all stem, word, and derivational types are organized into a hierarchy of types, each of which is associated with appropriate constraints. Extending the type hierarchy over derivational types and their result types more easily allows the various patterns of causatives and their linking patterns to be expressed. The second point implies that the formalism allows only a constrained correspondence between two stems, and hence entails a certain notion of locality. Only information specifically carried over from input to output by the rule is visible in the context where the causative stem occurs, and the syntax has no other access to the derivational history of a word. That is, we assume that the basic lexical entry for the stem buy need stipulate only the information shown in (38): (38) buy: v-stem & strict-trans buy-rel where v(erb)-stem and strict-trans(itive) are distinct types associated with the constraints illustrated in (39): (39) a. strict-trans: [- 〈NP, NP〉] b. v-stem: [ verb] Moreover, in the spirit of Wechsler (1995) and Davis (1996), we will assume that the projection of semantic roles to syntactic argument structure is mediated by general principles also formulated as constraints on lexical types. First, we assume, following Davis, that buy-rel is a subtype of act(or)-und(ergoer)-rel. This leads to the attributes and being appropriate for buyrel, and this classification, together with inheritance of the constraints in (39), means that the stem buy inherits all the information shown in (40): (40) buy: strict-trans verb - 〈NP, NP〉 buy-rel [] [ ] The classification of buy-rel as a subtype of act-und-rel is also the key to explaining its argument projection properties. Because of the general relation
(a subsumption-preserving homomorphism) that Davis establishes between stem types and types of semantic relation, it follows that any stem like buy must obey the constraints established for superordinate stem types. To see this, let us examine the case of buy a bit more closely. Davis posits stem types like those shown in (41).16 (41) a. actor-stem:
CONT [act-rel, ACT i]    ARG-ST 〈NP_i, . . . 〉
b. undergoer-stem:    CONT [und-rel, UND j]    ARG-ST 〈 . . . , NP_j, . . . 〉
Because buy-rel is a subtype of act-und-rel, which in turn is a subtype of both actor-rel and undergoer-rel, the strong correspondence between stem types and relation types requires that the stem buy must also be a subtype of both stem types in (41). Thus the stem buy must also inherit the constraints associated with those types. Unifying the constraints in (41) with the information in (40), we derive the correct linking pattern for buy, as shown in (42). (42) buy: strict-trans verb - 〈NPi , NPj 〉 buy-rel
ACT i    UND j
The canonical relation between - and features is also determined by a general type constraint, namely the constraint on the type stem. (43) stem: 1 compression (2) - 1 % 2 Here % designates the operation of list concatenation (or append). For the moment, we may assume that compression is just the identity function, and the constraints of this type just cause the - to be the list concatenation of the and lists (as illustrated earlier). An independent constraint guarantees that a stem’s value is a singleton list. Thus, because strict-trans is a subtype of stem, buy must inherit the information in (43) as well. Hence, in virtue of the system of lexical types and the associated type constraints,
16
Davis’s work follows a tradition pioneered in particular by Gawron and Wechsler, incorporating certain specific semantic analyses proposed by Pinker, and adapting ideas of Jackendoff. For an overview of the history of these ideas, see Davis (1996).
the minimal lexical entry for the stem buy given in (38) above is sufficient to guarantee that buy actually contains all the information in (44). (44) buy: strict-trans -
HEAD verb    SUBJ 〈[1]〉    COMPS 〈[2]〉    ARG-ST 〈[1]NP_i, [2]NP_j〉    CONT [buy-rel, ACT i, UND j]
This result is obtained in a principled, deductive fashion from constraints of considerable generality. In section 4.3 below, we will extend this treatment to include a lexical account of quantifier scoping as well.
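The deductive step from (38) to (44) can be mimicked with a toy constraint-inheritance sketch in Python. Constraint unification is reduced here to a compatible dictionary merge, and the attribute values are informal glosses; none of this is part of the analysis itself. The only point is that nothing beyond (38) has to be stipulated for the stem buy.

```python
# Toy sketch of type-constraint inheritance, cf. (38)-(44).

def unify(*constraints):
    """Merge constraint fragments, failing on conflicting values."""
    result = {}
    for c in constraints:
        for attr, val in c.items():
            if attr in result and result[attr] != val:
                raise ValueError("unification failure on " + attr)
            result[attr] = val
    return result

# Constraints contributed by the supertypes, cf. (39), (41), and (43):
v_stem         = {"HEAD": "verb"}
strict_trans   = {"ARG-ST": "<NP, NP>"}
actor_stem     = {"ACT": "index of the first ARG-ST member"}
undergoer_stem = {"UND": "index of a later ARG-ST member"}
stem           = {"SUBJ + COMPS": "compression(ARG-ST)"}

# The lexical entry itself stipulates only its relation (and its types), cf. (38):
buy_entry = {"RELN": "buy-rel"}

buy_full = unify(buy_entry, v_stem, strict_trans, actor_stem, undergoer_stem, stem)
for attr in buy_full:
    print(attr, "=", buy_full[attr])      # together, the information in (44)
```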
3.3
Causative stems
Causative stems bear a systematic phonological, syntactic, and semantic relation to the verb stems from which they are formed. The information that must be specified within any analysis of Japanese causative stems is the following: (45) a. -(s)ase is suffixed in the , b. the stem’s is embedded as the argument of the derived form’s , which is a ternary cause-rel relation, c. the derived form’s argument structure contains a causer subject and a causee complement (inter alia). Our intention is to account for these properties in terms of a single derivational type, caus(ative)-drv, the grammatical constraints particular to that type, and their interaction with constraints on other related lexical types. We posit only the following constraints as particular to the type caus-drv: 17 (46) caus-drv:
caus-stem Fsase (1) cause-rel 3 v-stem 1 3
First, let us consider the linking properties of causatives. The type causerel (like buy-rel ) is a subtype of act-und-rel. Hence (by the same reasoning 17
The function Fsase (X) yields X+sase, if X is vowel-final, and X+ase otherwise.
outlined in the previous section) the relation/stem correspondence ensures that caus-stem is a subtype of both actor-stem and undergoer-stem, which in turn entails that the first - member is linked to the causer () and the second - member to the causee (), as shown in (47): (47) caus-stem: Fsase (1) - 〈NPi , NPj , . . . 〉 cause-rel i j 3 As for the rest of the causative stem’s -, we will assume that this is a list consisting of just the - value of the noncausative stem, itself a list. The causative’s - value is thus a “nested” list (a list that contains another list as a member), a fact that will play a crucial role in our account of constraints on binding. On our analysis, causatives acquire such nested argument lists in virtue of the fact that caus-drv is a subtype of another type that we will call complex-pred(icate)-drv. A first version of the constraints on the type complexpred-drv (in Japanese) are the following:18 (48) complex-pred-drv: [- 〈1, 2, 4〈PRO, . . . 〉〉] [- 4] “PRO” here designates a special type of element that is associated with the subject of the basic stem. PRO is coindexed with some member of the (outer) - list in accordance with fundamentally semantic principles similar to those outlined for English control constructions in Sag and Pollard (1991) (see Davis 1996). At least for Japanese causatives, though perhaps not for all instances of the type comp-pred, it is the second - member (the causee) that is coindexed with PRO. Note that PRO is never an overt subject or complement. Because of the list embedding in (48), we must modify our account of the linking relation between - and . This is where the function compression is needed. The idea is still that the and lists add up to the argument structure, but we need to remove the embedded lists and PRO elements from the argument structure. Informally, what compression will do is flatten out embedded lists in the - list, promoting their members to be on a par with the other list members and deleting embedded PROs in the process (hence the name compression).19 18 19
We will later revise this to incorporate our account of lexicalized quantifier scoping. The function compression can be defined as follows (“←” designates “only if ”): (i) compression(〈 〉) = 〈 〉. (ii) compression(〈PRO |Y 〉) = Z ← compression(Y ) = Z.
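The flattening behaviour just described, and defined recursively in note 19, can also be rendered procedurally. The following is a purely illustrative Python sketch, with PRO represented by a simple sentinel value; the list-of-strings encoding of ARG-ST is an assumption made only for the example.

```python
# Illustrative sketch of compression: flatten embedded ARG-ST lists, promoting
# their members, and delete embedded PRO elements along the way.

PRO = "PRO"   # sentinel standing in for the unexpressed embedded subject

def compression(arg_st):
    flat = []
    for member in arg_st:
        if member == PRO:
            continue                          # embedded PROs are deleted
        elif isinstance(member, list):
            flat.extend(compression(member))  # embedded lists are flattened
        else:
            flat.append(member)               # ordinary synsems are kept
    return flat

# The nested ARG-ST of a causative like (49): <NP_i, NP_j, <PRO_j, NP_k>>
print(compression(["NP_i", "NP_j", [PRO, "NP_k"]]))
# ['NP_i', 'NP_j', 'NP_k']  -- exactly the SUBJ plus COMPS members
```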
With this revision in place, we can now see how the constraints illustrated in this section and the previous one interact to guarantee that the causative formed from the stem kaw- “buy” has all the properties illustrated in (49):20 (49) kawase- “cause to buy” caus-stem verb 〈1NP[]i〉 〈2NP[]j , 3NP[]k 〉 cause-rel i j buy-rel - 〈1i, 2j , 〈PROj , 3k〉 〉
ACT j    UND k
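How (45)–(49) fit together can be pictured with a small procedural sketch in Python. The dictionary layout and the label used for the embedded semantic argument are illustrative assumptions only; the suffixation function follows note 17, and the nested argument structure follows (48).

```python
# Illustrative sketch of causative stem formation (caus-drv), cf. (46)-(49).

PRO = "PRO"

def f_sase(phon):
    """Note 17: suffix -sase to a vowel-final stem, -ase otherwise."""
    return phon + ("sase" if phon[-1] in "aiueo" else "ase")

def caus_drv(stem, causer, causee):
    """Derive a causative stem: a new causer subject and causee complement,
    with the base stem's ARG-ST nested inside (its subject replaced by PRO,
    understood as coindexed with the causee)."""
    embedded_arg_st = [PRO] + stem["ARG-ST"][1:]
    return {"PHON": f_sase(stem["PHON"]),
            "ARG-ST": [causer, causee, embedded_arg_st],   # nested list, cf. (49)
            "CONT": {"reln": "cause-rel", "ACT": causer, "UND": causee,
                     "embedded": stem["CONT"]}}            # base CONT as third argument

kaw = {"PHON": "kaw", "ARG-ST": ["NP_i", "NP_j"],
       "CONT": {"reln": "buy-rel", "ACT": "NP_i", "UND": "NP_j"}}

kawase = caus_drv(kaw, causer="NP_c", causee="NP_i")
print(kawase["PHON"])     # kawase
print(kawase["ARG-ST"])   # ['NP_c', 'NP_i', ['PRO', 'NP_j']]
```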
Stems like the one in (49) may be used as the basis for building the inflected words that serve as syntactic heads. Inflection does not alter the valence, argument structure, or semantic content in relevant ways, however. Thus, the information sketched in (49) corresponds in all relevant respects to the information borne by a causative verb when it functions as the lexical head of a syntactic phrase, combining with its complements according to the same principles that govern the combination of noncausative heads. Let us now see how this analysis can be applied first to adjuncts and the alleged coordination facts, and then to issues of binding and scope.
4
Analysis
4.1
Adjunct scope and “coordination”
As we have seen, the verbal ending -te marks phrases that are better analyzed as adverbials, not conjuncts. We provide a uniform treatment of scope that covers the interpretation of adverbs, putative coordination, and a number of related issues. The analysis we will sketch, if nothing more is said, entails that adverbs will be added to valence lists freely and hence, given our assumptions about scrambling, freely ordered among other complements.
20
(iii) compression(〈X |Y 〉) = 〈X |Z 〉 ← X is a synsem, compression(Y ) = Z. (iv) compression(〈X |Y 〉) = Z ← X is a list, compression(X ) = X ′, compression(Y ) = Y′, append(X′, Y′) = Z. We leave a number of matters unresolved here. For case assignment, we assume general case assigning rules for Japanese (which may make reference to structural, lexical, or semantic features), but do not attempt to develop them here.
The essence of our proposal is a “zero derivation” type that adds an adjunct onto a verb stem’s - list (and hence onto its list). We couch this proposal in terms of the derivational type a(dverb-)t(ype-)r(aising)-drv sketched in (50), which encodes a kind of type-raising, a function-argument reversal commonly utilized within categorial grammar.21 (50) atr-drv
atr-stem - -
4 1 % 〈ADV[ 3]〉 3 [ 2] 4 1 2
Stems resulting from this type have a semantic content that is based on the adverb complement it will combine with, at the same time making the content of the value (the stem from which the atr-stem is “derived”) the argument of that adverbial. Since atr-stem is a subtype of canon-stem (see above), it also follows that the adverb is the last element of the atr-stem’s list. Note that the definition of compression given in note 19 interacts with the constraints in (50) to ensure that the and lists of a causative stem are correctly treated. For instance (51) shows the type that results when the basic verb stem kaw- “buy” is first causativized and then undergoes adverb type raising (hence giving the selected adverb wide scope): (51)
atr-stem -
21
kaw-ase 〈8NP[]k〉 〈9NP[]i, 2NP[]j , 3ADV[ 5]〉 〈8, 9, 〈PROi, 2〉, 3〉 adverb-rel cause-rel k 5 i buy-rel i j
Our analysis differs from categorial analyses in that it employs a highly restricted, lexically governed version of type raising. Its work is done before other arguments are combined with the raised functor. Nonetheless, all uses of our rule correspond to theorems of the Lambek calculus. Similar proposals for adverbial type raising in HPSG are made for French by Abeillé and Godard (1994), for Dutch by van Noord and Bouma (1994), and for English by Kim and Sag (1995).
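The function-argument reversal encoded by atr-drv lends itself to the same kind of sketch. The Python below is purely illustrative (the dictionary layout and helper name are our assumptions): the derived stem selects an adverb as an extra, final complement and hands its own content to that adverb as an argument.

```python
# Illustrative sketch of adverb type raising (atr-drv), cf. (50)-(53).

def atr_drv(stem):
    """Zero derivation: no affix; an ADV slot is appended to ARG-ST, and the
    stem's content becomes the argument of the adverbial relation supplied
    when the actual adverb combines."""
    return {"PHON": stem["PHON"],
            "ARG-ST": stem["ARG-ST"] + ["ADV"],
            "CONT": {"reln": "adverb-rel (supplied by the ADV complement)",
                     "ARG": stem["CONT"]}}

kaw = {"PHON": "kaw", "ARG-ST": ["NP_i", "NP_j"],
       "CONT": {"reln": "buy-rel", "ACT": "NP_i", "UND": "NP_j"}}

kaw2 = atr_drv(kaw)          # cf. (53): kaw2 now selects an adverbial complement
print(kaw2["ARG-ST"])        # ['NP_i', 'NP_j', 'ADV']

# Causativizing first and type-raising afterwards gives the adverb wide scope
# over cause-rel, as in (51); type-raising first and then causativizing gives
# the narrow scope reading, as in (54).
```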
But of course it is the possibility of the adverb modifying within the scope of cause-rel that is more challenging for a lexicalist theory. So consider again the conjunctive adverbial in (52). (52) Naomi wa hurui kutu o sute-te atarasii kutu o kat-ta. Naomi old shoes throw- new shoes buy- “Naomi threw away her old shoes and bought new ones.” The canonical stem kaw1 discussed in the previous section gives rise through atr-drv to a phonologically indistinguishable counterpart kaw2 of type atr-stem that must combine with an adverbial complement, as sketched in (53). (53) kaw2: atr-stem -
kaw 〈1NP[]i 〉 〈2NP[]j , 3ADV[ 5]〉 〈1, 2, 3〉 adverb-rel buy-rel 5 i j
It is this stem that gives rise to the inflected form kat-ta that occurs in (52). kaw2 can also give rise via caus-drv to a causative stem, as shown in (54): (54)
caus-stem -
kaw-ase 〈8NP[]k〉 〈9NP[]i, 2NP[]j , 3ADV[ 5]〉 〈8, 9, 〈PROi, 2, 3〉 〉 cause-rel k i adverb-rel buy-rel 5 i j
And it is tensed verbs formed from this stem that are the basis for the narrow scope reading of causative structures like (55), as illustrated in (56). (55) Ken wa Naomi ni hurui kutu o sute-te atarasii kutu o Ken Naomi old shoes throw- new shoes kaw-ase-ta. buy-- “Ken made Naomi throw away her old shoes and buy new ones.”
(56)
S 〈 〉 〈 〉 NP[] 〈 〉 〈 〉
VP 〈NP[]〉 〈 〉
Ken wa NP[] 〈 〉 〈 〉
VP 〈NP[]〉 〈NP[]〉
Naomi ni AdvP hurui kutu o sute-te
VP 〈NP[]〉 〈NP[], AP〉
NP[] 〈 〉 〈 〉 atarasii kutu o
V 〈NP[]〉 〈NP[], NP[], AP〉 kawaseta
Note that the adverbial phrase in this example appears higher in the tree than the causative verb, but nonetheless modifies only the verbal stem kaw. Given that the modification relations are fixed by the lexical entries and the phrases they project, the same interpretation results from a scrambled example such as (57): (57) Ken wa Naomi ni atarasii kutu o [hurui kutu o sute-te] Ken Naomi new shoes old shoes throw- kaw-ase-ta. buy-- “Ken made Naomi throw away old shoes and buy new shoes.” Allowing type-raising predicates to place adverbs on their - list in this way thus provides a straightforward account of both adverbial scope possibilities, and of the ability of -te phrases to scramble.
4.2
Binding theory
4.2.1 Reflexives The HPSG binding theory is based on hierarchical argument structure rather than constituent structure. As Pollard and Sag (1992, 1994) demonstrate, this
approach to binding provides an immediate solution to a variety of problems facing accounts of English binding stated purely in terms of constituencybased notions such as c-command. Our account of binding in Japanese is based on principles identical to those posited for English by Pollard and Sag, augmented by a new principle for long distance anaphors, such as Japanese zibun, identical to that proposed for Mandarin by Xue, Pollard, and Sag (1994). These principles are stated informally in (58). (58) HPSG Binding Theory: Principle A. A locally o-commanded anaphor must be locally o-bound. Principle B. A personal pronoun must be locally o-free. Principle C. A nonpronoun must be o-free. Principle Z. A long distance anaphor must be o-bound. The effect of these principles is to require an anaphor to be coindexed with a less oblique - member, if there is such a less oblique coargument. Otherwise, anaphors are free (subject to various discourse and processing considerations) to refer to appropriate elements in the discourse context. The Japanese reflexive zibun is clearly long distance, and hence properly governed by Principle Z. However, as we saw earlier, its antecedence is usually restricted to subjects. Manning (1994) argues that the correct constraint in these cases is the principle in (59): (59) A-subject Principle: Some anaphors must be bound by an entity that is first on some - list. Kitagawa (1986), citing unpublished work by K. Kurata, has argued that the expressions mizakara “self ” and zibun-zisin “self ” are true anaphors that obey Principle A. However, there are reasons to be skeptical of this claim. There are numerous counterexamples to the putative generalization that mizukara “self ” and zibun-zisin “self ” must have a local binder, as illustrated by the following examples: koto ga Tarooi o (60) a. Zibun-zisini ga hihan s-are-ta self criticism do-- Taroo nayamase-te iru. bother-- (lit.) “The fact that self was criticized bothers Taroo.” b. kono hoosiki no moto-de wa, wakai toki-ni zibun-zisini ga this system under young when self siharat-ta kingaku ga yokinsyai no nenkin ni tuika deki-ru. pay- amount depositer pension to add can- “Under this system, the amount that a depositer paid at his younger age can be added to his pension plans.” These might be explained away as “exempt” anaphors, that is as anaphors that, because they lack a local o-commanding element (nothing outranks a subject
in an - list), are not constrained to be locally o-bound. This approach is possible in the HPSG binding theory (see the formulation of Principle A given above), but not in other binding theories we are familiar with. However we doubt that this kind of analysis is sufficient to explain examples like the following, where zibun-zishin is locally o-commanded, but not locally o-bound. no ii (61) a. Tarooi wa tomodatij ni zibun-zisini/j ni tugoo Taroo friend self circumstances good syoogen o s-ase-ta. testimony do-- “Tarooi made his friendj give evidence convenient for himi/j .” b. Tanaka-kyoozyui wa [gakusei ga gakkoo-tookyoku dake Tanaka-professor student school-authorities only de-naku zibun-zisini ni mo sinrai o oi-te i-nakat-ta] be- self on even reliance place--- noni gakuzen to si-ta. since shocked do- “Prof. Tanakai got shocked at the fact that the students didn’t rely on not only the school authorities but also himi.” Moreover, as suggested to us by Takao Gunji (p.c., July 1993) it may simply be the emphatic nature of these expressions that makes them tend to prefer a local antecedent (at least in simple examples) without their actually being subject to Principle A. This line of reasoning, quite like that followed by Iida (1992) in her account of zibun-binding, seems more likely to provide a systematic account of the entire range of observations about zibun-zishin-binding. Thus we will tentatively regard both zibun and zibun-zisin as subject to Principle Z and the A-subject Principle. We may now examine the predictions made by our lexical analysis of causatives. Recall that in this analysis, the - list of the lower verb is embedded in the causative verb’s - list, as illustrated in (62). (62)
caus-stem -
hihan sase 〈1〉 〈2, 3〉 〈1NPi, 2NPj , 〈PROj , 3NPk〉 〉 cause-rel i j criticize-rel j k
In (62), the lower object 3 appears on the embedded - list of the verb. Thus according to the binding theory in (58), if the lower object is a true anaphor, then it is locally o-commanded and can be locally o-bound by only one element – PRO (coindexed with the causee). This prediction contradicts the claims made by Kitagawa (1986) and Yatabe (1993) about the ambiguity of examples like (63). zibun-zisini/j o hihan (63) Tarooi ga Zirooj ni aete Taroo Ziroo purposefully self criticism s-ase-ta. do-- “Tarooi purposefully made Zirooj criticize himselfi/j .” (Kitagawa 1986: (92)) Both of these researchers assume that the grammar of causatives must be reconciled with this ambiguity by somehow providing two domains in which the anaphor can be bound. However, on the assumption that zibun-zisin is not a true anaphor, but a long distance anaphor subject to Principle Z and the A-subject constraint (requiring that the binder of zibun-zisin be the first element of some argument structure list), the ambiguity of (63) is unproblematic. On this theory zibun-zisin is free to be bound by any o-commanding a-subject, and so we predict that it can be bound by either of the higher a-subjects, Taroo or Ziroo. 4.2.2 Pronouns Now let us consider again the pronominal coreference facts shown in (64). (64) Tarooi wa Zirooj ni {kare o /ø}i/*j bengo s-ase-ta. Taroo Ziroo him /pro defense do-- “Tarooi made Zirooj defend himi/*j .” The zero pronoun, or kare, in the lower object position allows the surface subject, but not the lower subject (the causee), as it antecedent. Here, again, we find confirmation of our nested argument structure analysis. The observed facts follow immediately from the nested - analysis and the assumption that missing arguments and kare are both pronominals. Considering again (62), we see that coindexation of the subject and the lower object is possible, because there is no - list where both elements occur. However, the lower object cannot be coindexed with the causee, because the causee shares an index with the lower subject, hence indexing the lower object in this way would make the lower object locally o-bound, in violation of Principle B. Hence, by assuming simply that Japanese has pronominal arguments, we can use the very same binding principles that have been applied to English and other languages. Principle B rules out coreference between the lower
object and the causee, but nothing blocks coreference between the lower object and the subject because the surface subject isn’t on the embedded - list. 4.2.3 Adverbial -nagara clauses This understanding of a-subjects, together with the preceding account of adjuncts is also the basis of the treatment we would give of adverbial -nagara clauses, which can be placed freely in a sentence like other adverbials, and which can be controlled by any a-subject, but not other noun phrases, as is shown for causatives in (65): (65) Taroo wa kodomotati ni utai-nagara tegami o kak-ase-ta. Taroo children sing-while letter write-- “Tarooi made the childrenj write a letter while hei/theyj sang.”
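The predictions that the nested ARG-ST in (62) yields under Principles B and Z can be checked mechanically. The sketch below is not an implementation of the binding theory in (58)–(59); it is an illustrative Python fragment (with an assumed list-of-dicts encoding) that only makes explicit which elements locally o-command the lower object.

```python
# Toy sketch of the binding predictions drawn from a nested ARG-ST like (62):
#   <Taroo_i, Ziroo_j, <PRO_j, lower-object>>
# A pronominal in the lower object slot may corefer with the matrix subject
# (they never share an ARG-ST list) but not with the causee, whose index the
# embedded PRO shares.

arg_st = [{"phon": "Taroo", "index": "i"},         # causer subject
          {"phon": "Ziroo", "index": "j"},         # causee
          [{"phon": None, "index": "j"},           # embedded PRO, coindexed j
           {"phon": "kare/zero", "index": "?"}]]   # the lower object

def local_coargument_indices(arg_st, outer, inner):
    """Indices of the co-members on the embedded list containing the element
    at arg_st[outer][inner], i.e. its potential local o-commanders."""
    embedded = arg_st[outer]
    return [m["index"] for k, m in enumerate(embedded) if k != inner]

local = local_coargument_indices(arg_st, outer=2, inner=1)
print(local)                  # ['j']: only the PRO coindexed with the causee
print("i" not in local)       # True -- Principle B allows coreference with Taroo
print("j" in local)           # True -- coreference with Ziroo is ruled out
```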
4.3
Quantifier scope
Quantified NPs pose a problem similar to that of adverbs: a quantified NP functioning as the lower object of a causative verb form can take intermediate scope, i.e. although external to the lexical causative, it can scope over the verb stem, but within the scope of the causative operator, as illustrated in (66). (66) Tanaka-sensei ga gakusei ni sansatu hon o sirabe-sase-ta. Prof. Tanaka student three book check-- “Prof. Tanaka made [the student check three books].” In order to deal with this matter, we must first enter into a slight digression about the treatment of quantifier scope in HPSG. 4.3.1 Quantifier scope in HPSG The theory of quantifier scope presented in chapter 8 of Pollard and Sag (1994) (P&S) is based on the technique of quantifier storage pioneered in Cooper (1983). “Cooper storage” is a method allowing a variable to go proxy for a quantifier’s contribution to the content of a sentence, while the quantifier which binds that variable is placed in a “store.” Stored quantifiers are gathered up from the daughters of a phrase and passed up to successively higher levels of structure until an appropriate scope assignment locus is reached. There quantifier(s) may be retrieved from storage and integrated into the meaning, receiving a wide scope interpretation, as illustrated in (67) in terms of the HPSG features () and ():
(67)
S {} every-memo-j some-person-i read(i,j)
S {every-memo-j} some-person-i read(i,j)
NP [ some-person-i]
somebody
V {} read(i,j)
VP {every-memo-j} read(i,j)
reads
NP {every-memo-j} j every memo
On P&S’s version of Cooper’s theory, is specified for two attributes: () and (), the former taking a list of generalized quantifiers as its value, the latter taking what we have here treated as relations. On their theory, all quantifiers “start out” in storage, and retrieval (removal of some set of quantifiers from the set and appending of some ordering of the removed set to the head’s list) is allowed at higher levels of structure, subject to various constraints. This means that the scope assigned to a quantifier can in principle be any higher semantic domain, i.e. any semantic domain containing the semantics of the minimal clause containing the quantified NP. P&S’s version also differs from Cooper’s in eliminating the nonbranching structure (the “S-over-S” structure in (67)) associated with retrieval. However, the theory presented by P&S has at least one serious defect22 – its failure to provide for the possibility that in raising or extraction constructions, a quantifier may have scope corresponding to a lower syntactic position. As is well known, a sentence like (68), for example, allows a de dicto reading where the matrix subject takes narrow scope with respect to seems : (68) A unicorn seems to be approaching. “It seems that there is a unicorn approaching.” In recent work, Pollard and Yoo (1998) suggest the beginnings of a solution to this problem. First, they propose to make a feature of local objects, 22
Exactly the same defect as Montague’s (1974) “proper treatment of quantification,” incidentally. We thank Bob Carpenter for pointing out some of the problems in the P&S theory of quantification.
rather than a feature of the highest level of grammatical structure (the sign), as P&S proposed. This revision has the consequence that within raising and extraction constructions, the stored quantifiers are identified. That is, the value of the subject of seems in a cascaded raising structure like (68) is also the value of the (unexpressed) subject of to, the value of the subject of be, and the value of the subject of the verb approaching. Thus if the NP a unicorn in (68) has an existential quantifier in its , so does the value of the lowest verb in (68) – the verb that assigns a semantic role to the index bound by that quantifier. Pollard and Yoo propose to change the way storage works, so that unscoped quantifiers are passed up to the mother in a headed structure not from all the daughters (as in Cooper’s account or that of P&S), but only from the semantic head daughter. To achieve this, they let the value of a verb V be the set union of the values of V’s - members (at least those - members that are assigned a role in the value of V). We illustrate the effect of their proposal in terms of the Quantifier Amalgamation Constraint (69a), which is formulated in terms of the merge-quants relation defined in (69b).23 (69) a. Quantifier Amalgamation Constraint (preliminary): word: - 1 merge-quants (1) b. merge-quants( 〈 [ 1], . . . , [ o] 〉 ) = 1 c . . . c o On this approach, the of the verb in (70) is nonempty and may be passed up the tree from head-daughter to mother as sketched in (70). (70)
S [ 3]
[
1NP {some-person}]
VP 3 〈1〉
some person -
V some-person, 3 every-memo 〈1, 2〉 〈1〉 reads
[
2NP {every-memo}]
every memo
Let us ignore adjuncts for present purposes, considering only the case where the syntactic head and semantic head are the same, as in a structure 23
c here designates the relation of disjoint set union, which is defined exactly like familiar set union, except that its arguments must be disjoint sets (i.e. they must have an empty intersection). We will modify the definition of the Quantifier Amalgamation Constraint below.
like (70). S-level retrieval of stored quantifiers is done in accordance with the constraint sketched in (71):24 (71) Pollard and Yoo Quantifier Retrieval: 3 % 1 5 4 * 2 3 = order (2) -
1 5 4
And if we now reconsider the tree in (70) in light of the retrieval scheme sketched in (71), we can see the possibility of S-level quantifier retrieval of the sort sketched in (72): (72)
[
S {} some-person, order every-memo read-rel 4 i j
1NP {some-person}]
some person -
24
VP 3 〈 〉 4 〈1〉 〈 〉
V some-person, 3 every-memo 〈 〉 4 〈1, 2〉 〈1〉 〈2〉 24 reads
[
2NP {every-memo}]
every memo
We use * to designate a relation of contained set difference: if Σ2 is a subset of Σ1, then Σ1 * Σ2 is the standard set difference of Σ1 with respect to Σ2; otherwise, the contained
This correctly allows for both possible scopings for (72). It also assigns to (68) a reading where the subject has narrow scope with respect to seems, because is now part of and hence the value of seems is the value of to and be and hence is the value (and first - member) of approaching, which collects its own value from the values of its arguments. Thus the of approaching in (68) contains a-unicorn and that quantifier can be retrieved from storage anywhere in the tree higher than approaching. Retrieval at the VP node dominating just approaching will produce the scoping of a-unicorn inside the scope of seems. A potential problem with this approach, however, is that it lets retrieval happen in too many places. Unless one stipulates further constraints, this system (like the one in P&S) produces spurious analyses of every available reading. For example, allowing both S and VP retrieval in structures like (72) produces each possible scoping in three different ways (each retrieval order at one node, or one quantifier retrieved at each node). 4.3.2 Lexicalizing quantifier scoping The adaptation of the Pollard and Yoo analysis that we propose to solve the scope problems discussed in section 3.3.5 will at the same time eliminate this redundancy in the Pollard and Yoo system. We propose to let retrieval and scope assignment be entirely lexical in nature. By stating lexical constraints to the effect that a word’s value is an ordering of some set subtracted from the union of the values of the verb’s arguments, it is possible in fact to eliminate phrasal retrieval and the feature entirely. A lexical head passes up in its value the quantifiers from its arguments which are not already scoped in its value, or earlier retrieved in its value. These unscoped quantifiers are thus passed up into the value of the phrase projected by the lexical head. At this point, they are seen, and possibly retrieved by the head of the next higher syntactic domain. This proposal, similar in certain ways to lexical type raising, involves modifying the Quantifier Amalgamation Constraint as follows:25
25
set difference is not defined. Note also that up until now we have been showing the value of () as simply the () value which excludes the effect of quantification, but in the semantic theory of HPSG, a clausal has both a nucleus () and a list of the quantifiers scoped at that node (), as in the signs in this section. The function toplevel returns just the unembedded members of an - list. In other words, except for cases of nested - lists formed by complex predicates, it will also act as an identity function. It can be defined as follows: (i) toplevel(〈 〉) = 〈 〉. (ii) toplevel(〈X |Y 〉) = 〈X |Z〉 ← X is a synsem, toplevel(Y ) = Z. (iii) toplevel(〈X |Y 〉) = Z ← X is a list, toplevel(Y ) = Z
(73) Quantifier Amalgamation Constraint (revised): stem: - 1 merge-quants (toplevel(1) ) c 2 * 3 - 2 [ order(3)] The importance of the shift from word to stem, and the need to introduce a -() attribute will be discussed in a moment. For the moment, one can assume that the value of - is always the empty set. Now, first, note that in consequence of (73) the word reads (formed without relevant changes from the stem read ) must be constrained along the lines sketched in (74). (74)
word
PHON reads
ARG-ST 〈NP_i[QSTORE [1]], NP_j[QSTORE [2]]〉
QSTORE ([1] c [2]) * [3]
CONT [QUANTS order([3]), NUCL [read-rel, ACT i, UND j]]
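The division of labour that (73) imposes on a word like reads — every quantifier contributed by a role-assigned argument is either scoped, in some order, in QUANTS or passed up in QSTORE — can be pictured with a small, purely illustrative Python sketch; quantifiers are just atoms here and the helper names are our own.

```python
# Illustrative sketch of lexical retrieval for a simple word, cf. (73)-(74).
from itertools import permutations

def merge_quants(qstores):
    """Disjoint union of the arguments' QSTORE values."""
    merged = set()
    for qs in qstores:
        assert not (merged & qs), "merge-quants requires disjoint sets"
        merged |= qs
    return merged

def lexical_retrievals(arg_qstores, new_q=frozenset()):
    """Every (QUANTS, QSTORE) split that the amalgamation constraint allows;
    NEW-Q is empty for basic stems and only matters for complex predicates."""
    pool = merge_quants(arg_qstores) | set(new_q)
    for r in range(len(pool) + 1):
        for scoped in permutations(pool, r):     # QUANTS is an ordering
            yield list(scoped), pool - set(scoped)

for quants, qstore in lexical_retrievals([{"some-person"}, {"every-memo"}]):
    print("QUANTS", quants, "QSTORE", qstore)
# The two complete retrievals give the two scopings of (70)/(72); whatever
# is left in QSTORE is passed up to be scoped by a higher lexical head.
```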
Other aspects of the Pollard and Yoo theory remain unchanged. Thus, each lexical head gets a chance to scope the quantifiers of its role-assigned arguments, and those quantifiers from arguments that are not scoped remain in the verb’s to be passed up to higher levels of structure. There are exactly as many scope assignment points in a sentence as there are lexical heads. And since there is no structure-based retrieval, a sentence like (68) has no spurious semantic derivations. The constraints that are part of the lexical entry of the word reads simply allow two readings (corresponding to the two distinct orderings of the quantifiers on the verb’s list). Note finally that this modification of the Pollard and Yoo theory still produces the correct two readings for A unicorn seems to be approaching, allowing seems or approaching to assign scope to a-unicorn.26 4.3.3 Quantifier scope with morphological causatives Let us now return to our analysis of causatives. We have already modified the Quantifier Amalgamation Constraint as in (73). In consequence of this modification, the verb stem sirabe “check” must inherit all the constraints shown in (75): 26
In order to eliminate all spurious ambiguity in infinitival structures, we must state some further constraint ensuring that semantically vacuous raising verbs like to and be do not assign scope lexically. This is easily formulated as a lexical constraint requiring that elements like to and be identify their value with that of their complement (lexicalizing one part of a constraint proposed by Pollard and Yoo).
(75)
v-stem
PHON sirabe
ARG-ST 〈NP_j[QSTORE [1]], NP_k[QSTORE [2]]〉
QSTORE ([1] c [2]) * [3]
NEW-Q { }
CONT [QUANTS order([3]), NUCL [check-rel, ACT j, UND k]]
The constraints specified in (75) say simply (i) that the value of the stem’s is an ordering of some subset of the argument’s values and (ii) that those quantifiers not in are in the stem’s value. The lexical retrieval for a complex predicate is similar, but it must take into account the possibility that the stem from which it is formed may have already assigned scope to some but not necessarily all of the quantifiers from the embedded argument structure. This is where the attribute - comes in. We will say that for most basic stem types, including verb stems, the value of - is the empty set. But for certain derived stem types such as those licensed by complex-pred-drv, the value of - will be the set of quantifiers that were not yet scoped in the stem from which they were built. This will be achieved by the following revision to the constraints on the type complex-pred-drv:27 (76) complex-pred-drv:
- - -
〈1, 2, 4〈PRO, . . . 〉 〉 6 4 6
As a result, a causative stem, of type caus-stem, will obey the licensing constraints of complex-pred-drv above, and the further licensing constraints of caus-drv, and in addition the caus-stem will be subject to the revised version of the Quantifier Amalgamation Constraint. Application of the Quantifier Amalgamation Constraint to the caus-stem will allow there to be complements of the basic stem whose quantifiers scope over caus-rel, because the caus-stem inherits into its any elements in the of the stem (that is elements that were not scoped at the level of the stem), via the attribute -. On the other hand, the input stem to complex-pred-drv may have “already scoped” some of its quantifiers. That is, some quantifiers may appear in the stem’s value. Such elements will not appear in the of the stem (through the regular workings of the Quantifier Amalgamation Constraint), and so will not 27
The attribute - is taken from Przepiórkowski (1997), where it is used in the lexical entry of quantifier words to introduce new quantifiers. His paper also presents a development and further formalization of the approach to lexicalizing quantifier scoping introduced here.
appear in the - of the derived causative stem. At the level of the causative stem, the Quantifier Amalgamation Constraint collects only quantifiers in unembedded elements of the - – this is where the function toplevel comes in – and adds to those quantifiers any as yet unscoped quntifiers in the - set. As a result, each element of the - will end up being retrieved precisely once. The subject of the embedded stem (the PRO coindexed with the causee) of course cannot contribute to the embedded stem’s value, a fact that is simply accommodated if PRO’s is assumed to be empty. By this mechanism, complements of the entire causative verb can be assigned scope within the argument of cause-rel, but any elements that are not assigned narrow scope in this manner will be inherited by the causative stem and must be assigned wider scope somewhere within the sentence. We illustrate in (78) one possible way of instantiating the constraints we have outlined for a causative stem, namely the one corresponding to the problematic reading of (77) – where cause-rel outscopes 3-books which in turn outscopes check-rel : (77) Tanaka-sensei ga gakusei ni sansatu hon o sirabe-sase-ta. Prof. Tanaka student three book check-- “Prof. Tanaka made the student check three books.” (78)
[Feature structure for the caus-stem sirabesase, instantiated for the reading in question: subject 〈1 NPi[4]〉, complements 〈2 NPj[5], 3 NPk[6]〉, argument structure 〈1, 2, 〈PROj, 3〉〉; the not-yet-scoped set 7 is empty; the stem's scoped quantifiers are order(8), with 8 drawn from 4 ∪ 5 ∪ 7 and the rest remaining in store; the nucleus is cause-rel with roles i and j, whose psoa argument has scoped quantifiers order(6) and nucleus check-rel with roles j and k.]
Thus these lexical types allow exactly the desired result – the direct object of the lower verb may contribute a quantifier that scopes wide or narrow with respect to cause-rel. In the case of (77), the quantifier store of the NP sansatu-no hon-o is the singleton set {3-books}, and this will serve to instantiate the tag 6 in (78), thus producing the desired narrow scope assignment. In sum, the lexically based revision of the Pollard and Yoo theory of quantifier storage and quantifier scoping that we have sketched seems to fit well with our theory of Japanese causatives. Although complex words of Japanese
preserve their lexical integrity (Bresnan and Mchombo 1995), NPs external to those words may still be assigned scope intermediate to the semantic elements of a lexicalized complex predicate like a causative. This result follows once verbal stems, rather than syntactic phrases or words, are taken as the locus for quantifier scope assignment.
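The bookkeeping behind (75) and (76) can be made concrete with a small procedural sketch. It is only an illustration under assumptions of mine (the representation of stores as plain Python sets, the names in the code, and the brute-force enumeration are not part of the formalism just described), but it shows how a basic stem and a derived complex-predicate stem between them retrieve each inherited quantifier exactly once.

from itertools import permutations, chain, combinations

def subsets(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def scopings(arg_stores, unscoped=frozenset()):
    """Possible retrievals for one stem, on a hypothetical reading of (75)/(76).

    arg_stores -- quantifier stores of the stem's unembedded arguments
    unscoped   -- quantifiers inherited, still unscoped, from an embedded stem
    Yields (quants, store): an ordering of the quantifiers scoped at this stem,
    and the leftover set passed up for retrieval higher in the sentence.
    """
    pool = frozenset(chain(unscoped, *arg_stores))
    for retrieved in subsets(pool):
        for quants in permutations(retrieved):        # "order(...)" in (75)
            yield list(quants), pool - retrieved

# Hypothetical run for (77): the basic stem sirabe- sees PRO (empty store) and
# the object NP carrying the quantifier 'three-books'; whatever it leaves in
# store is handed to the causative stem sirabe-sase- as its unscoped set.
for inner_q, inner_store in scopings([frozenset(), {'three-books'}]):
    for outer_q, outer_store in scopings([frozenset(), frozenset()],
                                         unscoped=inner_store):
        print(inner_q, outer_q, set(outer_store))

Run on the configuration of (77), the sketch prints exactly three possibilities for three-books: scoped inside the embedded psoa, scoped at the causative stem, or left in store for retrieval higher in the sentence.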
4.4 Passives
A final point to be addressed is the interaction of causativization with passivization. It is well-known (e.g. Kuno 1973) that causatives of transitive verbs allow passivization of the -ni marked phrase, but not of the -o marked (lower) object, as in (79).28
(79) a. Mitiko ga Taroo ni Ziroo o yob-ase-ta.
        Mitiko NOM Taroo DAT Ziroo ACC call-CAUS-PAST
        "Mitiko made Taroo call Ziroo."
     b. Taroo ga Mitiko ni Ziroo o yob-ase-rare-ta.
        Taroo NOM Mitiko by Ziroo ACC call-CAUS-PASS-PAST
        "Taroo was made by Mitiko to call Ziroo."
     c. *Ziroo ga Mitiko ni(-yotte) Taroo ni yob-ase-rare-ta.
        Ziroo NOM Mitiko by Taroo DAT call-CAUS-PASS-PAST
        "(lit.) Ziroo was made called by Taroo by Mitiko."
In contrast, monomorphemic ditransitive verbs allow passivization of either object. Here we adopt Hasegawa's (1981) suggestion that passivization of the lower object is impossible because it doesn't have a thematic role (i.e. it is not the value of a role attribute) in the top-level semantic content of the clause. This suggestion receives independent support from the fact that this constraint appears to hold generally in Japanese. First, (80) shows that idioms (whose argument we assume not to be assigned a thematic role) cannot passivize (Yatabe 1990):
(80) a. Kenitiroo ga saba o yon-da
        Kenitiroo NOM mackerel ACC read-PAST
        "Kenichiroo gave a false count."
     b. *saba ga Kenitiroo ni yom-are-ta
        mackerel NOM Kenitiroo by read-PASS-PAST
28 Ishikawa (1985) questions this generalization, suggesting that passivization is possible in examples such as Hukei o yorokob-aseru tame, toku-ni muzukasii zi ga kodomotati ni kak-ase-rare-ta "In order to impress the parents, particularly difficult characters were caused (by the teachers) to be written by the children." But our surveys suggest that such examples are judged unacceptable by the vast majority of Japanese native speakers.
Further, Kuno (1976) argues that (81a) is an example of raising-to-object in Japanese and he notes that the raised object fails to passivize, as can be seen in (81b).
(81) a. Noriko ga Masaru o hannin da to omot-ta.
        Noriko NOM Masaru ACC culprit is COMP think-PAST
        "Noriko thinks Masaru to be the culprit."
     b. *Masaru ga Noriko ni hannin da to omow-are-ta.
        Masaru NOM Noriko by culprit is COMP think-PASS-PAST
        "Masaru was thought to be the culprit by Noriko."
Thus the failure of passivization is expected on independent grounds, and provides no evidence against the merged argument structures embodied in our analysis.
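The condition just adopted can be phrased as a one-line check. The flat dictionary encoding of the top-level content below is an assumption made purely for illustration, not a representation used in this chapter.

def passivizable(arg_index, top_content):
    """Hasegawa-style condition: an argument can be passivized only if its
    index is the value of some role attribute in the top-level content."""
    return arg_index in top_content.values()

# Causative of a transitive verb as analyzed above: causer and causee fill
# roles of cause-rel at the top level; the lower object only fills a role
# inside the embedded psoa, which is omitted from this flat dictionary.
top_content = {'causer': 'i', 'causee': 'j'}
print(passivizable('j', top_content))   # True  -- the -ni phrase, cf. (79b)
print(passivizable('k', top_content))   # False -- the lower object, cf. (79c)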
5 Conclusion
This paper began by mentioning the many phenomena that motivate a lexical analysis of Japanese causatives. Such phenomena support the Lexical Integrity Principle of Bresnan and Mchombo (1995), and argue that Japanese causatives behave as a single clause with respect to case, word order, and similar properties. In addition, we have examined the diverse phenomena that have been assumed to motivate the multiclausal analysis of Japanese causatives, reaching the conclusion that all such data are handled at least as well or better within our single-clause, embedded argument structure approach. The solutions we have been led to bear on larger issues than the particulars of Japanese grammar discussed here. For example, our proposed solution to the problems posed by the interaction of adverb scope and causatives builds crucially on an analysis where heads select for their modifiers, rather than the familiar treatment where adjuncts select for the phrases that they combine with syntactically. The theory we have sketched here, unlike the more familiar alternative, provides a uniform account of both sublexical and supralexical scoping, and hence may deserve consideration as the basis for the treatment of adverbs more generally. Similarly, our lexicalized account of quantifier scoping, which eliminates syntactic retrieval entirely from the theory of Cooper storage, allows sublexical scoping of a sort that is inconsistent with other approaches, including that of Pollard and Yoo (1998). We started out with Kiparsky’s observation on the nonarbitrariness of the diverse properties that causative constructions exhibit. The account we have developed provides the beginnings of an explanation for the duality of causatives. Since case marking, agreement, and word order are all
determined by the interaction of principles constraining the way lexical items can appear in constituent structures, it follows that, with respect to these properties, morphological causatives behave just like other words of similar valence, exhibiting essentially all the properties of single lexical items. But construal processes such as honorification, binding, and quantifier floating are in general sensitive to argument structure, as Yatabe (1993) and Manning (1994) have observed. Thus our analysis of causative constructions in terms of complex argument structures leads us to predict evidence of embedding with all and only phenomena of this type. We find these results highly suggestive, not just for the treatment of causatives, but for the design of grammar in the broadest sense.
References

Abeillé, Anne and Danièle Godard. 1994. The complementation of tense auxiliaries in French. In Proceedings of the Thirteenth West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information. 157–172.
Alsina, Alex. 1993. Predicate Composition: A Theory of Syntactic Function Alternations. Ph.D. dissertation, Stanford University.
Borsley, Robert. 1989. Phrase-structure grammar and the Barriers conception of clause structure. Linguistics 27: 843–863.
Bresnan, Joan and Annie Zaenen. 1990. Deep unaccusativity in LFG. In Grammatical Relations. A Cross-Theoretical Perspective, ed. Katarzyna Dziwirek, Patrick Farrell, and Errapel Mejías-Bikandi. Stanford: Center for the Study of Language and Information. 45–57.
Bresnan, Joan and Sam A. Mchombo. 1995. The Lexical Integrity Principle: evidence from Bantu. Natural Language and Linguistic Theory 13: 181–254.
Butt, Miriam. 1993. The Structure of Complex Predicates: Evidence from Urdu. Ph.D. dissertation, Stanford University.
Chew, J. 1961. Transformational Analysis of Modern Colloquial Japanese. Ph.D. dissertation, Yale University.
Cooper, Robin. 1983. Quantification and Syntactic Theory. Dordrecht: Reidel.
Copestake, Ann. 1992. The Representation of Lexical Semantic Information. Cognitive Science Research Papers 280. University of Sussex.
Davis, Anthony. 1996. Lexical Semantics and Linking in the Hierarchical Lexicon. Ph.D. dissertation, Stanford University.
Flickinger, Daniel. 1987. Lexical Rules in the Hierarchical Lexicon. Ph.D. dissertation, Stanford University.
Grimshaw, Jane. 1990. Argument Structure. Cambridge, MA: MIT Press.
Gunji, Takao. 1987. Japanese Phrase Structure Grammar: A Unification-Based Approach. Dordrecht: Reidel.
Harada, S.-I. 1973. Counter-Equi NP deletion. Research Institute of Logopaedics and Phoniatrics, University of Tokyo. Annual Bulletin 7: 113–148.
Hasegawa, Nobuko. 1981. Lexicalist grammar and Japanese passives. Coyote Papers 2: 25–40. University of Arizona, Tucson.
Hinds, John. 1973. Some remarks on soo su-. Papers in Japanese Linguistics 2: 18–30.
Iida, Masayo. 1992. Context and Binding in Japanese. Ph.D. dissertation, Stanford University. Published 1996. Stanford: Center for the Study of Language and Information.
Ishikawa, Akira. 1985. Complex Predicates and Lexical Operations in Japanese. Ph.D. dissertation, Stanford University.
Kathol, Andreas. 1995. Linearization-Based German Syntax. Ph.D. dissertation, Ohio State University.
Kim, Jong-Bok and Ivan A. Sag. 1995. The parametric variation of English and French negation. In Proceedings of the Fourteenth West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information.
Kiparsky, Paul. 1982. Lexical morphology and phonology. In Linguistics in the Morning Calm, ed. The Linguistic Society of Korea. Seoul: Hanshin. 3–91.
Kiparsky, Paul. 1987. Morphology and Grammatical Relations. MS, Stanford University.
Kitagawa, Yoshihisa. 1986. Subjects in Japanese and English. Ph.D. dissertation, University of Massachusetts, Amherst. Published 1994. New York: Garland.
Kuno, Susumu. 1973. The Structure of the Japanese Language. Cambridge, MA: MIT Press.
Kuno, Susumu. 1976. Subject Raising in Japanese. In Syntax and Semantics 5: Japanese Generative Grammar, ed. Masayoshi Shibatani. New York: Academic Press. 17–49.
Kuroda, Sige-Yuki. 1965. Generative Grammatical Studies in the Japanese Language. Ph.D. dissertation, Massachusetts Institute of Technology.
Kuroda, Sige-Yuki. 1981. Some recent trends in syntactic theory and the Japanese language. Coyote Papers 2: 103–121. University of Arizona, Tucson.
Manning, Christopher D. 1994. Ergativity: Argument Structure and Grammatical Relations. Ph.D. dissertation, Stanford University. Published 1996. Stanford: Center for the Study of Language and Information.
Manning, Christopher D. and Ivan A. Sag. 1998. Argument Structure, valence and binding. Nordic Journal of Linguistics 21: 107–144.
Marantz, Alec. 1982. Re reduplication. Linguistic Inquiry 13: 435–482.
McCawley, James D. 1968. The Phonological Component of a Grammar of Japanese. The Hague: Mouton.
Meurers, Detmar. 1995. Towards a Semantics for Lexical Rules as used in HPSG. Paper presented at the Conference on Formal Grammar, Barcelona, Spain, at the Tübingen HPSG workshop, and the ACQUILEX II Workshop on Lexical Rules, Cambridge, UK. [Revised version available at http://www.sfs.nphil.unituebingen.de/ dm/.]
Miller, Philip H. 1991. Clitics and Constituents in Phrase Structure Grammar. Ph.D. dissertation, Rijksuniversiteit Utrecht. Published 1993. New York: Garland.
Miller, Philip H. and Ivan A. Sag. 1997. French clitic movement without clitics or movement. Natural Language and Linguistic Theory 15: 573–639.
Miyagawa, Shigeru. 1980. Complex verbs and the lexicon. Coyote Working Papers 1. University of Arizona, Tucson.
Miyagawa, Shigeru. 1989. Structure and Case Marking in Japanese. San Diego: Academic Press.
Momoi, Katsuhiko. 1985. Semantic roles, variation, and the Japanese reflexive. University of Chicago Working Papers in Linguistics 1: 73–92.
Montague, Richard. 1974. The proper treatment of quantification in ordinary English. In Formal Philosophy, ed. Richmond Thomason. New Haven: Yale University Press.
Muraki, Masatake. 1978. The sika nai construction and predicate restructuring. In Problems in Japanese Syntax and Semantics, ed. John Hinds and Irwin Howard. Tokyo: Kaitakusya. 155–177.
Pollard, Carl and Ivan A. Sag. 1987. Information-Based Syntax and Semantics, vol. 1. CSLI Lecture Notes Series no. 13. Stanford: Center for the Study of Language and Information.
Pollard, Carl and Ivan A. Sag. 1992. Anaphors in English and the scope of binding theory. Linguistic Inquiry 23: 261–303.
Pollard, Carl and Ivan A. Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information.
Pollard, Carl and Eun Jung Yoo. 1998. A unified theory of scope for quantifiers and wh-phrases. Journal of Linguistics 34: 415–445.
Poser, William J. 1984. The Phonetics and Phonology of Tone and Intonation in Japanese. Ph.D. dissertation, Massachusetts Institute of Technology.
Poser, William J. 1989. What is the "Double-o Constraint" a constraint on? MS, Stanford University.
Poser, William J. 1992. Blocking of phrasal constructions by lexical items. In Lexical Matters, ed. Ivan Sag and Anna Szabolcsi. Stanford: Center for the Study of Language and Information. 111–130.
Przepiórkowski, Adam. 1997. Quantifiers, Adjuncts as Complements, and Scope Ambiguities. MS, University of Tübingen.
Rappaport, Malka and Beth Levin. 1988. What to do with theta-roles. In Syntax and Semantics, vol. 21: Thematic Relations, ed. Wendy Wilkins. San Diego: Academic Press. 7–36.
Riehemann, Susanne. 1993. Word Formation in Lexical Type Hierarchies: A Case Study of bar-Adjectives in German. Master's thesis, University of Tübingen.
Riehemann, Susanne. 1995. Type-Based Morphology. MS, Stanford University.
Sag, Ivan A., and Carl Pollard. 1991. An integrated theory of complement control. Language 67: 63–113.
Saiki, Mariko. 1987. On the Manifestations of Grammatical Functions in the Syntax of Japanese Nominals. Ph.D. dissertation, Stanford University.
Shibatani, Masayoshi. 1973. Semantics of Japanese causativization. Foundations of Language 9: 327–373.
Shibatani, Masayoshi. 1990. The Languages of Japan. Cambridge: Cambridge University Press.
Sugioka, Yoko. 1984. Interaction of Derivational Morphology and Syntax in Japanese and English. Ph.D. dissertation, University of Chicago.
Takubo, Yukinori. 1990. On the role of hearer's territory of information – a contrastive study of dialogic structure in Japanese, Chinese, and English as manifested in the third person pronoun system. In Advances in Japanese Cognitive Science, vol. 3. Tokyo: The Japanese Cognitive Science Center. 66–84.
Van Noord, Gertjan, and Gosse Bouma. 1994. Adjuncts and the processing of lexical rules. Proceedings of Coling 1994, Kyoto. 250–256.
Wechsler, Stephen. 1995. The Semantic Basis of Argument Structure. Stanford: Center for the Study of Language and Information.
Xue, Ping, Carl Pollard, and Ivan A. Sag. 1994. A new perspective on Chinese ziji. In Proceedings of the Thirteenth West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information. 432–447.
Yatabe, Shûichi. 1990. The Representation of Idioms. MS, Stanford University.
Yatabe, Shûichi. 1993. Scrambling and Japanese Phrase Structure. Ph.D. dissertation, Stanford University.
2 A syntax and semantics for purposive adjuncts in HPSG
Michael J. R. Johnston
University of California at Santa Cruz
1 Introduction
The appropriate characterization of the meaning of the preposition "for" has been a recalcitrant problem for theories of syntactic and semantic relations (Fillmore 1968, Platt 1971, Green 1974, Allerton 1982). In this paper, I investigate the semantic interpretation of "for" prepositional phrases in English and examine the parallels that exist between "for" PPs and infinitival purpose clauses. I propose that both "for" PPs and infinitival purposives should be analyzed as adjuncts which introduce a higher-order relation of purpose which holds between the eventuality described by the clause which the adjunct modifies and a potential eventuality introduced by the purposive adjunct itself. The critical difference between infinitival purposives and "for" PPs is that in the former case the eventuality introduced by the adjunct is explicitly described by the adjunct clause, while in the latter case, the nature of the eventuality has to be encoded in the lexical semantics of "for." I will call these descriptions of eventualities parameterized-states-of-affairs (psoas). I develop an analysis of "for" PPs and infinitival purposive constructions in head-driven phrase structure grammar (HPSG, Pollard and Sag 1987, 1994). This analysis effectively captures the parallels in function and distribution between "for" PPs and purpose clauses. The analysis is further developed to
1 I would like to thank William A. Ladusaw for his supervision of the research project of which this paper is a product. He was a constant source of encouragement and insightful advice and suggestions. I would also like to thank Donka Farkas, Sandy Chung, Ivan Sag, Dan Flickinger, Georgia Green, Stephen Wechsler, Louise McNally, Michelle Hart, Kari Swingle, Chris Kennedy, and Chuck Wallace and to acknowledge the support of Linguistics Research Center at UC Santa Cruz. Thanks are also due to the anonymous reviewers of this paper for their helpful and illuminating comments.
2 This research was done while the author was at the University of California, Santa Cruz.
account for NP-modifying “for” PPs and infinitival purpose clauses. The paper is structured as follows. In section 2, I introduce the different interpretations of “for” PPs that I am concerned with and elucidate the parallels that exist between “for” PPs and infinitival purposives. In section 3, I present analyses of “for” PPs, infinitival purposives, and NP-modifying purposives. In section 4, I summarize the results of the paper and discuss some remaining issues.
2 "For" PPs as expressions of purpose
There are three different interpretations of a “for” PP which share the property that they specify the purpose of the psoa described by the clause they modify. I will first distinguish these different purposive senses of “for,” and then go on to show how they are similar to infinitival purposives. Throughout this paper, I will refer to the clause which the adjunct modifies as the main clause. The eventuality described in the main clause will be referred to as the main clause psoa. I will refer to the clause in an infinitival purposive as the adjunct clause. The eventuality introduced by a “for” PP or an infinitival purposive will be referred to as the adjunct psoa.
2.1 The different purposive "for" PPs
A prepositional phrase headed by "for" can be used to specify the purpose of the main clause psoa.3 There are three types of "for" PPs which I am going to argue express a purpose relation. The first two of these were grouped together with the name "benefactives" in traditional grammar. The third set introduces an object which the agent of the main clause intends to acquire. I will return to these after discussion of the "benefactives."
There are several other uses of “for” PPs which I am not concerned with in this paper. For example, “for” can be used to identify one of the complements of a verb, as in (i)–(iii). (i) Sandy looked for Mary in the garden. (ii) The colonel asked the waiter for the bill. (iii) Mrs. Chippam mistook the customer for an assistant. I assume these verbs “look,” “ask,” and “mistake” subcategorize for “for” PPs and that “for” simply acts as a case marker which mediates the linking relation between an argument position in the semantics of the verb and a syntactic complement. I am not concerned here with the use of “for” to express duration as shown in (iv). (iv) Dorothy lived in Kansas for ten years. Neither am I concerned with the use of a “for” PP to express point of view, as in (v). (v) For Mary, the weekend in Venice seemed most enchanting. The contribution of the “for” PP here is that the proposition “the weekend in Venice seemed most enchanting” should be attributed to Mary.
2.1.1 Benefactives The intuition behind the term “benefactive” was that the phrases in question introduced a participant who benefited in some way from the psoa described in the sentence. For example, consider (1). (1)
John baked a cake for Mary.
Example (1) has an easily available interpretation where John intends to please Mary by baking the cake.4 I will refer to this as the benefactive interpretation of a “for” PP. While this is a possible interpretation of a “for” PP, it is not the only one. There is another interpretation which is recurrent and salient. The “for” PP “for Mary” in (1) can also mean that John intended Mary to receive the cake that he baked.5 I am going to refer to this interpretation as the recipient interpretation of a “for” PP. These two interpretations are independent of each other. It is possible for a “for” PP to have the recipient interpretation and not the benefactive interpretation and vice-versa. In an example like (1), it is not a necessary part of the satisfaction conditions that the action of baking should please Mary. It may be that John knows that Mary hates cakes and he simply wants her to have a cake that he made. We can make the absence of an obligatory benefactive element of the satisfaction conditions more salient by loading the context. For example, consider (2) and (3). (2) (3)
John bought the castor oil for the kids. (recip) John prepared the poison for the king. (recip)
In (2), John probably does not intend to please the kids with the castor oil, and on the interpretation of (3) where John intends the king to consume the poison, John most probably does not intend to please the king. There is a variety of evidence that these benefactive and recipient interpretations of a “for” PP correspond to lexical ambiguity of “for.” First of all, such a distinction is plausible since in other languages, such as Spanish, different prepositions are used for these two senses. In Spanish, “por” is used for the benefactive interpretation, while “para” is used for the recipient interpretation. The hypothesis that the recipient and benefactive readings for a “for” PP correspond to different senses is further supported by the fact that the recipient interpretation and benefactive interpretation impose differing constraints on the clause which they modify. The recipient interpretation is only available when the main clause has an object. If the main clause is intransitive, there is no possibility of a recipient interpretation of a “for” PP, and only the benefactive interpretation is available. For example, both (4) and (5) only have a benefactive interpretation and do not have a recipient interpretation. 4
4 This led Fillmore (1968: 32) to use the label "benefactive" as the case assigned to "for" PPs, although in that paper he does not give an explicit account of their meaning.
5 Many authors have used "benefactive" as the case assigned to an entity gaining possession of something, Chafe (1970: 48), Platt (1971: 48), Halliday (1971: 147). Green (1974: 98) refers to this sense as "the 'have' relationship."
(4) John left early for Mary. (*recip)
(5) John ate early for Mary. (*recip)
The recipient interpretation not only requires that the verb have an object; the object has to be one which is created, transferred, or transformed in some way by the action described by the verb, as in examples (6), (7), and (8) respectively. Following Allerton (1982), I will refer to this sort of object as an affected object.
(6) Creation: Oliver baked a cake for Elizabeth. (recip)
(7) Transfer: Oliver bought the candy for Elizabeth. (recip)
(8) Transformation: Oliver cleaned some shoes for Elizabeth. (recip)
The recipient interpretation of “for” is not available in the following examples because the object in the main clause is not an affected object. (9) John watched a film for Mary. (*recip) (10) John visited Australia for his mother. (*recip) The recipient interpretation of (9) is impossible because the film is not affected by John’s watching it. Also there is no recipient interpretation of (10) because Australia is not affected by John’s visit. The benefactive interpretation is readily available in these examples though. This restriction on the recipient interpretation provides evidence that the recipient and benefactive interpretations are in fact different senses of a “for” PP. Further distinctions between these two interpretations will be discussed in the next section when I discuss the parallels between “for” PPs and infinitival purposives. I assume henceforth that the recipient and benefactive interpretations correspond to two different senses of “for.” While the recipient and benefactive senses of “for” impose different restrictions on the clause they modify, their contribution to the meaning of an utterance is very similar. They are both expressions of intention or purpose. In the case of the recipient sense, “for” introduces not necessarily the actual recipient, but rather the intended recipient. Under the recipient interpretation of “for Mary,” (1) does not describe a psoa in which Mary necessarily receives “a cake”; that is, it is not part of the satisfaction conditions of the interpretation that the entity denoted by the complement of “for” receive the affected object in the clause. However, (1) can describe a psoa in which John intends Mary to receive the cake that he bakes; that is, Mary is the intended recipient. The intentional nature of the recipient interpretation is highlighted by comparison with recipients in PPs headed by “to.” These are complements of certain verbs such as “give,” “bring,” and “take,” as in (11). (11) John gave/brought/took a book to Mary. (actual recipient) These verbs can also appear with a “for” PP with the recipient interpretation, as in (12).
(12) John gave/brought/took a book for Mary. (intended recipient) For (11) to be true Mary must receive the book, while for (12) to be true it is only necessary that John intends Mary to receive the book. Similarly, benefactive “for” PPs express not that the entity denoted by the complement of “for” was in fact pleased by the action, but rather that the main clause psoa was carried out with the intention that that entity be pleased as a result. For example, (13) does not entail that Mary was pleased by John’s going early to work, but rather that John intended that his going to work early should please Mary. (13) John went to work early for Mary. This similarity between the recipient and benefactive interpretations can be captured if we assume that they are both expressions of purpose and that they both introduce a higher-order relation of purpose between two psoas. They differ in the relation in the adjunct psoa but they share the same higher level relation of purpose. Before going into the analysis in more detail, I am going to introduce two further purposive interpretations of “for” PPs, the deputive and acquire interpretations, and show how “for” PPs parallel infinitival purposives. 2.1.2 The deputive interpretation There is a further interpretation of a “for” PP in which the complement of “for” denotes a person on whose behalf the agent of the main clause performs the action described in the main clause. For example, consider (14) and (15). (14) John visited Australia for his mother. (deputive) (15) John put the olives in the martinis for his mother. (deputive) The “for” PPs in (14) and (15) can be interpreted as introducing the person on whose behalf the agent of the main clause performs the action. Allerton (1982: 104) refers to this interpretation as the deputive interpretation.6 Like the recipient interpretation and the benefactive interpretation, this interpretation contributes the purpose for the agent’s action. In most examples where a deputive interpretation is possible, a benefactive interpretation is possible also. Furthermore, these two interpretations do not impose different restrictions on the main clause. It may be that there should be another sense of “for” which is responsible for the deputive interpretation, but in this paper I am going to assume that they are both instances of the benefactive sense of “for.” 6
Somers (1987: 34–35) discusses the history of this interpretation of a “for” PP. It was one of the interpretations that Chafe (1970: 151) and Platt (1971: 30, 50f ) attributed to the “benefactive” case.
2.1.3 The acquire sense of “for” There is a further purposive interpretation of a “for” PP under which the complement of “for” is an object which the agent of the main clause psoa intends to acquire. I will refer to this as the acquire interpretation. (16) John killed for food. (acquire) (17) John killed for love. (acquire) In (16) and (17), John killed in order to acquire “food” and “love” respectively. This must also be an independent sense of “for” since, unlike the recipient interpretation, there is no restriction to clauses which describe psoas with an affected object. It cannot be grouped with the benefactive interpretation since it never means that the complement of “for” is pleased or benefits from the action described. The thing this sense of “for” shares with the benefactive and recipient interpretations is the intentional element of its meaning. It expresses the fact that the agent of the main clause intends to acquire the object described by the complement of “for.” The acquire sense of “for” does not entail that the agent actually does acquire the object, only that they intended to as a result of their action. I propose that this sense of “for” should also introduce a parameterized state-of-affairs that expresses a purpose relation between the main clause psoa and an adjunct psoa. This would only differ from the other purposive senses of “for” in the nature of the adjunct psoa, which in this case would describe the agent of the main clause acquiring the object described by the complement of “for.” 2.1.4 Summary In summary, I have identified three separate senses of a “for” PP which all function to describe the purpose of the parameterized state-of-affairs described by the clause they modify. These senses are responsible for the recipient, benefactive, and acquire interpretations. I proposed that all three senses introduce a psoa with the relation purpose which takes two psoas as its arguments. In each case, the first psoa is the main clause psoa. The second psoa is the adjunct psoa and differs from sense to sense. Further evidence for analyzing “for” PPs as purposive constructions comes from their comparison to infinitival purpose clauses, and in particular from the fact that there is a subclassification into object-oriented versus event-oriented purposives which cuts across both categories. These parallels are explored in the next two sections.
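Schematically, and in a notation of my own rather than the AVM notation used later in this chapter, the three senses can be pictured as contributing one and the same purpose relation over the main clause psoa, written here as sigma_main, while differing only in the adjunct psoa: y stands for the referent of the complement of "for", o for the affected object of the main clause, and x for the agent of the main clause.

\begin{align*}
\textit{recipient:}   &\quad \mathit{purpose}(\sigma_{main},\ \mathit{receive}(y, o))\\
\textit{benefactive:} &\quad \mathit{purpose}(\sigma_{main},\ \mathit{please}(\sigma_{main}, y))\\
\textit{acquire:}     &\quad \mathit{purpose}(\sigma_{main},\ \mathit{acquire}(x, y))
\end{align*}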
2.2 Infinitival purposives
Infinitival purposives are expressions such as “for the children to give to Mary” and “in order to give it to Mary” as in (18) and (19). (18) Marty baked it for the children to give to Mary. (19) John baked a cake in order to give it to Mary.
Like the “for” PPs discussed in the last section, infinitival purposives introduce an eventuality that does not necessarily actually take place. In the examples above the eventuality of giving is not entailed to actually take place, rather it is stated as part of the intentions of the agent of the main clause. There are a number of striking parallels between “for” PPs and infinitival purposives. The first I am going to address concerns the interpretation of indefinites in these constructions. 2.2.1 Parallels with respect to specificity The intensional nature of both “for” PPs and infinitival purposives is reflected by the fact that if there is an indefinite NP as the complement of “for” or as an argument in the infinitival purposive, that NP can have a epistemically specific or an epistemically nonspecific interpretation. As an example, consider (20) with a recipient interpretation of “for a talented actor.” (20) The director designed it for a talented actor. Under the specific interpretation there is one particular talented actor that the director has in mind when he designs it, while under the nonspecific reading, the director does not have a particular actor in mind but intends that some individual who is a talented actor should receive it. The same ambiguity exists when an indefinite appears in an infinitival purposive, as in (21). (21) The director designed a part that had many lines for a talented actor to play. (22) An actor plays a part. The indefinite “a talented actor” can have either a specific or a nonspecific reading. On the specific reading, the director designs the part with the intention that a certain actor should play it. On the nonspecific interpretation, the director designs that part with the intention that some talented actor should play it, but without a particular talented actor in mind. Nonspecific interpretations of indefinites are also possible in the main clause, but they are generally associated with generic interpretations, as in (22). What is crucial here is that the indefinite can have a non-specific interpretation when the sentence does not have a generic interpretation. If we assume that both “for” PPs and infinitival purposives introduce a purpose relation to the semantics, and that associated with this relation there is a modal operator which anchors the psoa introduced by the purposive to the intentions of the agent of the clause modified, this ambiguity could be analyzed as the result of whether the indefinite has wide or narrow scope with respect to this modal operator. It is implicit in my analysis that the psoa introduced by the purposive adjunct is modally subordinated. In this paper, I will not formalize the modal operator in such a way that this specificity effect can be formalized. The further parallels that I want to demonstrate all relate to distinctions between infinitival purposives that match up to related distinctions among “for” PPs. In the next section, I describe the different types of infinitival purpose clauses and how to distinguish them.
2.3 The different types of infinitival purposives
Bach (1982), Jones (1985, 1991), and Wallace (1986) discuss a subclassification of infinitival purposives into two classes. I will call these two classes true purpose clauses (PCs) and rationale clauses (RatCs). (23) John gave the piano to Mary for her to practice on. (PC) (24) John gave the piano to Mary (in order) for her to practice on it. (RatC) The italicized expressions in (23) and (24) both express the intentions of the agent John in giving the piano to Mary. The expression in (23) is a true purpose clause and the expression in (24) is a rationale clause. The immediately visible syntactic difference between a PC and a RatC is that a RatC may be optionally preceded by the words “in order,” as in (24), while PC may not, as in (25). (25) *John gave the piano to Mary in order for her to practice on. (PC) A further difference between PCs and RatCs is that PCs must have a gap in them which is interpreted as coreferential with an object in the clause they modify. The determination of the antecedent of this gap is generally assumed to be determined according to some theory of control be it configurational, functional, semantic, or pragmatic. The gap may be in the subject position or in an object position. Jones (1985, 1991) distinguishes the two types, naming them SPC and OPC respectively. The PC in (26) is an OPC while the example in (27) is a SPC. I will use the term PC as a cover term for both SPCs and OPCs. (26) John bought a book for Mary to read– to the children. (OPC) (27) Lord James hired a boy– to clean out the stables. (SPC) RatCs may only have a gap in the subject position and not in an object position, as can be seen in (28). (28) *John bought a book in order for Mary to read– to the children. (RatC) (29) Lord James hired a boy in order– to clean out the stables. (RatC) In (29) the antecedent of the subject gap is not the boy, as it was in (27) but in fact Lord James. The controller of a subject gap in a RatC must be the person who is responsible for the psoa which the RatC modifies. PCs may have both an object and a subject gap, in which case the object gap will be controlled by the shared object between the PC and the main clause. The subject gap is generally controlled by the person responsible for the main clause psoa.7 7
It may also have a somewhat more free interpretation. For example: (i)
I brought along this wine– to drink with dinner.
The subject of “drink” may be interpreted as the speaker, or as the group of diners.
For the discussion that follows it is important to clarify the distinction between RatCs with a subject gap and subject gap purpose clauses (SPCs). Certain forms meet the structural criteria for both types of purposives. For example, since it has a subject gap, (30) could be an SPC. However, since “in order” is optional in RatCs and the subject gap in the adjunct is controlled by the person ( John) responsible for the main clause psoa, (30) could also be a RatC. This is further supported by the addition of “in order” as in (31). (30) John left early to go to school. (31) John left early in order to go to school. To more clearly see the difference between subject gap RatCs and SPCs we need to consider an example where the subject gap in the adjunct is not controlled by the person responsible for the main clause psoa. Example (32), where the main clause object controls the subject gap in the adjunct, is such a case, and as we would expect “in order” is not possible ((33)). (32) John invited her to sing at the concert. (33) *John invited her in order to sing at the concert. In the following discussion I will use SPCs of this form rather than ambiguous cases such as (30) when illustrating the various differences between PCs and RatCs. Having established these two basic criteria for distinguishing PCs and RatCs, I now discuss the other differences between them and show how these differences parallel the differences between recipient “for” PPs and the other types of “for” PPs, addressing, in turn, the restriction to main clause psoas with an affected object, parallels involving pseudoclefts and VP-anaphora, constraints on the interpretation of sequences of purposives, future-orientation, and NP-modification possibilities. 2.3.1 The restriction to psoas with an affected object PCs and RatCs differ in that only RatCs are possible with intransitive verbs, as shown by (34) and (35). (34) John left early in order for Mary to move in. (35) *John left early to take to school. The crucial difference between PCs and RatCs is that, while both express a psoa which is the purpose of the clause they modify, PCs require that there be some object in the main clause which is involved in the adjunct psoa. The psoa introduced by a RatC may involve an object from the main clause, but this is not necessary. This distinction between PCs and RatCs parallels the distinction between recipient “for” PPs and benefactive and acquire “for” PPs. Like PCs, recipient “for” PPs require that there be a participant shared by the main clause psoa and the adjunct psoa, and they are not possible with intransitives. Like RatCs, the benefactive and acquire “for” PPs do not
require that there be a participant shared by the main clause psoa and the adjunct psoa. I discussed the fact above that recipient “for” PPs require that the main clause psoa involve an affected object. PCs appear to impose the same restriction on the clause they modify. Bach (1982: 38) observes the following three sorts of contexts in which a PC is possible: I. Have/Be. (36) John has an umbrella in the closet to use when it rains. (37) The umbrella is kept in the closet for you to use when it rains. II. Transitive verbs which involve a continuance or change in the state of affairs of a positive sort. (38) I bought the umbrella to use when it rains. (39) I baked a cake to eat with dinner. III. Verbs of choice and use. (40) I chose a Jane Austen novel to read to the students. A recipient “for” PP is possible in all of these cases as shown in (41)–(45). I. Have/Be. (41) John has an umbrella in the closet for the night watchman. (42) The umbrella is kept in the closet for the night watchman. II. Transitive verbs which involve a continuance or change in the state of affairs of a positive sort. (43) I bought the umbrella for the night watchman. (44) I baked a cake for Mary. III. Verbs of choice and use. (45) I chose a Jane Austen novel for the students. The majority of these examples are cases where the object is created, transferred, or transformed in some way. These are what, in keeping with Allerton (1982), I have been calling “affected objects.” It is not immediately clear that the objects of “have” and “be” are affected objects from the semantics of these verbs, but both recipient “for” PPs and PCs are possible in these cases. It is the case, however, that the objects in these examples have been prepared in some way, even though it is not made explicit by the verb. For example, in (36), “John has an umbrella in the closet to use when it rains,” the umbrella is an affected object, not in the semantics of “have,” but in the context. Part of the context is that John placed the umbrella in the closet and it is for this reason that the umbrella is an affected object. In the analysis that follows, I will not attempt to account for this licensing from context but concentrate on
examples where the object is an affected object by virtue of being assigned an appropriate semantic role. The following examples serve to further emphasize this parallel between recipient "for" PPs and PCs.
(46) *John read a book to review. (PC)
(47) John read a book for Mary. (*recip)
(48) *John watched a film for Mary to watch. (PC)
(49) John watched a film for Mary. (*recip)
The objects of “read” and “watch” are not affected objects. Since both PCs and recipient “for” PPs require an affected object they should not be possible with “read” or “watch”. The impossibility of a PC with these verbs is supported by the ungrammaticality of (46) and (48). The impossibility of a recipient “for” PP is shown by the fact that (47) and (49) can only have a benefactive interpretation and cannot have a recipient interpretation. A further parallel between recipient “for” PPs and PCs shows up in examples with pseudoclefts and VP-anaphora. 2.3.2 Pseudoclefts and VP-anaphora I have established that both PCs and recipient “for” PPs require that there be an affected object in the main clause psoa. The benefactive and acquire interpretations are like RatCs in that they do not impose this restriction on the main clause psoa. A recipient interpretation of a “for” PP is not possible when it does not directly follow the VP, as in pseudocleft constructions. In (50), “for Mary” is separated from the VP by the clefting and the recipient interpretation is not available. In (51), “for Mary” is clefted with the VP and the recipient interpretation is available. (50) What John did for Mary was [buy some candy]. (*recip) (51) What John did was [buy some candy for Mary]. (recip) The benefactive interpretation of the “for” PP is possible in both of these examples. The behavior of PCs is parallel in that they are only possible when they directly follow the VP, as shown by (52) and (53) (OPC) and (54) and (55) (SPC), while RatCs are possible in both positions, as shown by (56) and (57). (52) (53) (54) (55) (56) (57)
What John did was buy some candy to give to Mary. (OPC) *What John did to give to Mary was buy some candy. (OPC) What John did was invite her to sing at the concert. (SPC) *What John did to sing at the concert was invite her. (SPC) What John did was leave the country (in order) to please Mary. (RatC) What John did (in order) to please Mary was leave the country. (RatC)
VP-anaphora exposes a similar parallel between recipient “for” PPs and PCs. A recipient interpretation of a “for” PP is not possible when it modifies a pronoun “it” which is in an anaphoric relation with a VP with an affected
object, as in (58). The benefactive interpretation of “for” is perfectly acceptable in this example. (58) John bought cookies and he did it for Mary. (*recip) A PC is not possible in this environment either, as shown in (59) for OPCs and (60) for SPCs, while RatCs are possible, as shown in (61). (59) *John bought cookies and he did it to give to Mary. (OPC) (60) *John invited her and he did it to sing at the concert. (SPC) (61) John left the country and he did it (in order) to please Mary. (RatC) These similarities between recipient “for” PPs and PCs, and benefactive “for” PPs and RatCs provide further support for the proposal that both “for” PPs and infinitival purposives are essentially the same kind of semantic entity. They are both expressions of purpose. The similarity between recipient “for” PPs and PCs can be captured by assuming they both constrain the main clause psoa to having an attribute which is an affected object. Interestingly, the PCs and the recipient interpretation of a “for” PP are possible in cases of null VP anaphora as shown by (62) and (63). (62) Bill bought cookies for Mary and John did for Sue. (recip) (63) Bill bought cookies to give to Mary and John did to give to Sandy. (OPC) The “it” anaphora in (58)–(61) is what Hankamer and Sag (1976) call deep anaphora, and the ellipsis in (62) and (63) is what they call surface anaphora. The recipient “for” PP and the PC are only possible in surface anaphora cases and not in deep anaphora cases. In the analysis proposed in section 3, the requirement that there be an affected object is local in that there has to be an attribute which is an affected object in the psoa introduced by the expression which the adjunct modifies, the main clause psoa. The restriction of PC and recipient “for” PP modification to surface anaphora cases will follow if in such cases the elided VP material is actually present in the semantic representation, because that would mean there would be a “buy” relation with an object which is an affected object in the semantic representation of the second conjunct. In the deep anaphora cases, in the semantic representation of the second conjunct there is a semantic translation for “it” rather than a copy of the “buy” psoa from the first conjunct. Since there is no affected object in the semantic representation of what the adjunct modifies, the PC and recipient “for” PP are not possible. This constraint also shows up in cases where there is more than one purposive modifying a clause, and it is to those cases that I now turn. 2.3.3 Constraints on sequences of purposives “For” PPs and infinitival purposives are also alike in that there can be more than one in a sentence. They share the property that when there is more than
one, the purposes must be interpreted recursively. What this means is that if there is more than one the second one specifies the purpose of the complex psoa introduced by the first. I will consider “for” PPs first of all. Gawron (1986: 376 –378) observed that in an example with more than one “for” PP the second “for” PP modifies the complex situation created by the addition of the first “for” PP. (64) Bob made a sweater for Sue for Mary. Example (64) describes a situation where Bob’s making a sweater is to please Sue and that whole action of making it to please Sue is to please Mary. He assumes a unitary “benefactive” interpretation and does not consider the recipient sense of “for.” Given the assumption that each “for” PP in (64) can have more than one interpretation, we would expect there to be a variety of different readings depending on which of the interpretations each “for” PP has. Consider the following example, in which there is no possibility of the first “for” PP modifying the object of the verb directly, since the object is the pronoun “it.” (65) Peter baked it for Mary for her kids. This can mean that Peter baked it, say a cake, with the intention that Mary receive the cake and that as a result of that her kids receive the cake. I assume that this is the result of both “for” PPs having the recipient interpretation. Example (65) can also mean that Peter baked it with the intention that Mary receive it and furthermore that Mary’s receiving the cake would please her kids. I assume that this is the result of the first “for” PP having the recipient interpretation and the second having the benefactive interpretation. This example can also mean that Peter baked it in order to please Mary and he intended that to please her kids. Interestingly the interpretation where the first “for” PP is benefactive and the second is recipient is not available. This would mean that Peter baked it to please Mary and he had the intention that the kids receive it. This is another manifestation of the local nature of the restriction of recipient “for” PPs to modifying main clauses that describe psoas which have an affected object. The first “for” PP introduces a please relation. Given the assumption that purposives are interpreted recursively, the second “for” PP has to modify this pleasing relation. The please relation does not have an affected object so the recipient interpretation is unavailable. Infinitival purposives behave in a parallel fashion. They are also necessarily interpreted recursively. (66) John baked it to take to the party to give to Mary. (67) John left the room in order to please Mary in order to cheer up Sandy. Example (66) has to mean that John baked it so that it could be taken to the party and as result of that it could be given to Mary. In (67) the purposives are RatCs. They cannot be interpreted as independent expressions of the
purpose of John leaving the room. The purpose of John’s leaving the room is to please Mary and furthermore by pleasing Mary he intends to cheer up Sandy. Given the parallels we have already seen between recipient “for” PPs and PCs, in particular the proposal that they both require an affected object in the main clause psoa, we would expect a PC to be unable to follow a purposive that does not have an affected object. Wallace (1986: 6–7) discusses the fact that when both a PC and a RatC occur in the same clause, the RatC must follow the PC, as shown by (68) and (69). (68) I bought it[i ] [to wear e[i ]] [to please my mother]. (69) *I bought it[i ] [to please my mother] [to wear e[i ]] . Wallace adopts Faraci (1974)’s idea that this is because RatCs attach to a higher level of syntactic projection than a PC. If RatCs attach to S and PCs to VP, then RatCs will necessarily have to be outside PCs in linear order. This fact can be accounted for without reliance on this stipulative assumption that these adjuncts subcategorize for different syntactic categories. If the second in a sequence of purposives must modify the adjunct psoa introduced by the first, then if we have a RatC which does not have an affected object in its psoa, the only combination possible will be for the RatC to follow the PC. If the PC followed the RatC, the constraint that the main clause psoa for the PC have an affected object would not be met. This explains the grammaticality of (68) in contrast to the ungrammaticality of (69). The ungrammaticality of (69) follows from the fact that there is no affected object in the RatC “to please my mother.” In (68), on the other hand, the PC comes first and its main clause psoa is the “buy” psoa which has an affected object, the thing bought. So then, we have the following parallel: just as in a sequence of two purposive “for” PPs the second can only have the recipient interpretation if the first has an affected object, in a sequence of infinitival expressions of purpose the second can only be a PC if the first contains an affected object. The affected object restriction is also relevant to combinations of “for” PPs with the recipient and acquire interpretations. The “for” PP in (70) can have the benefactive interpretation but not the recipient interpretation. (70) John worked for his family. (*recip) This effect is accounted for by the fact that there is not an affected object in the work psoa. However, in (71), when the first “for” PP has the acquire interpretation, the second “for” PP can have the recipient interpretation. (71) John worked for food for his family. (acquire, recip) This kind of example provides direct evidence that the second “for” PP must modify the purpose introduced by the first “for” PP. The first “for” PP introduces an acquire relation which has an object which is transferred, the thing acquired. This is an affected object so the condition required for the recipient interpretation is met. If the second “for” PP was directly modifying the “work”
psoa, it would not be able to have a recipient interpretation because the psoa described by “work” does not have an affected object. As we would expect the order of the two “for” PPs cannot be reversed, as in (72). (72) *John killed for his family for food. (*recip, acquire) Since the purpose introduced by the first “for” PP will be modified by the second “for” PP, the “for” PP with an affected object must come first in linear order. 2.3.4 Future orientation Another criterion which groups PCs with the recipient interpretation of “for” PPs is that they are both future-oriented; that is, the time to which the adjunct psoa introduced by a PC or a recipient “for” PP is anchored is necessarily later than the time when the main clause psoa takes place. Bach (1982) points out that RatCs need not be future-oriented with respect to the time of the main clause psoa, as in the example in (73). (73) I bought it in order to use up my money. His point is that the main action, buying it, and the desired consequence, using up all the money, are cotemporaneous. By virtue of the fact that these modifiers specify a purpose there is no way for the adjunct psoa in the RatC to be in the past with respect to the main clause psoa. They are best characterized as nonpast; that is, they may be cotemporaneous or future-oriented with respect to the time of the main clause psoa. PCs on the other hand must be future-oriented with respect to the time of the main clause psoa. This makes sense if we assume that what PCs do is express the relation between an act of preparation of some sort, the main clause psoa, and the desired use of an object, the adjunct psoa. This explains the oddity of (74). (74) *John baked the cake– to brown. (PC) Even though the cake is an affected object the PC is cotemporaneous with the main clause “baking” psoa and thus not felicitous. In (75) on the other hand, the PC adjunct psoa is future-oriented with respect to the main clause psoa and thus the PC is fine. (75) John baked the cake to eat with dinner. (PC) If you want to express the fact that John baked the cake so that in the process of baking the cake would become brown a RatC has to be used and thus the gap must be filled by a pronoun, as in (76). (76) John baked the cake in order– to brown it. (RatC) The same distinction in temporal orientation extends to the distinction between recipient “for” PPs and benefactive “for” PPs. (77) John baked a cake for Mary.
Under the interpretation where Mary receives the cake, it is necessarily after the baking that Mary receives the cake. On the reading where John intends to please Mary by baking the cake, Mary being pleased and John baking the cake can be cotemporaneous. 2.3.5 NP-modification A further parallel between “for” PPs and infinitival purposives is that they can both modify NPs. Furthermore, recipient “for” PPs and PCs group together in that they can modify object-denoting NPs, while the other “for” PPs and RatCs group together in that they can only modify event-denoting NPs. The purposive adjunct in these cases specifies that the object denoted by the NP was intended to be used for the purpose specified by the purposive adjunct. In (78), the adjunct “for Mary” indicates that Mary was the intended recipient of the book. In (79), the adjunct “for John to give to Mary” indicates that someone intended John to give the book to Mary. (78) The book for Mary is on the table. (recip) (79) The book for John to give– to Mary is on the table. (PC) Benefactive “for” PPs and RatCs and can only modify NPs which denote events. In (80), the adjunct “for Mary” indicates that the party was intended to please Mary. In (81), the adjunct “for freedom” indicates that the thing the participants in the battle intended to acquire was freedom. In (82), the adjunct “to find a cure” indicates that the purpose of the search was to find a cure. (80) [The party for Mary] is on Sunday. (please) (81) [The battle for freedom] cost many lives. (acquire) (82) [The search to find a cure] is taking too long. (RatC) Unlike PCs and recipient “for” PPs, other types of “for” PPs and RatCs are not felicitous as modifiers of object-denoting NPs. In (83), the adjunct “for Mary” can only have a recipient interpretation, it cannot have a benefactive interpretation, and the ungrammaticality of (84) shows that RatCs cannot modify object-denoting NPs. (83) The book for Mary is on the table. (84) *The book in order to give to Mary is on the table. (RatC) The interpretation of these NP-modifying purposives is addressed in section 3.6. 2.3.6 Summary In this section, I have shown that certain interpretations of “for” PPs are like infinitival purpose clauses in that they specify the purpose for which the state or event described by the main clause psoa was brought about. They share with purpose clauses the fact that an indefinite NP in the “for” phrase can have either a specific or a nonspecific reading, even when the interpretation
of the clause is nongeneric. Furthermore, I have shown that there are a variety of different factors that distinguish recipient “for” PPs and PCs, which I group together as object-oriented purposives, from benefactive “for” PPs, acquire “for” PPs, and rationale clauses, which I group together as event-oriented purposives. The essential difference is that recipient “for” PPs and PCs both require there to be an affected object in the psoa described by the clause they modify. Recipient “for” PPs and PCs are also alike in that they are future-oriented, and they can modify object-denoting NPs. The other “for” PPs and RatCs can introduce psoas which are cotemporaneous with the main clause psoa and they can modify event-, but not object-, denoting NPs. The parallels between these different subtypes of purposive constructions are summarized in the chart in (85).

(85)
     Object-oriented                        Event-oriented
     PC                                     RatC
     recipient “for” PP                     benefactive “for” PP
                                            acquire “for” PP
     Require an affected object             Don’t require an affected object
     Cannot be stranded in pseudoclefts     Can be stranded in pseudoclefts
     Cannot modify deep anaphora VP         Can modify deep anaphora VP
     Future orientation                     Non-past orientation
     Modify object-denoting NPs             Modify event-denoting NPs
In the following section, I present an analysis of purposive adjuncts which captures the parallels between purposive “for” PPs and infinitival purposives and accounts for the different properties of these constructions addressed in this section.
3 An analysis of “for” PPs in HPSG
The theory of head-driven phrase structure grammar (HPSG) is particularly well suited to the analysis of phenomena such as purposive adjuncts because it facilitates representation of the interaction between syntactic and semantic information in these constructions. I adopt the version of the theory which is described in Pollard and Sag (1994). In the first two subsections below, I introduce the purpose relation between psoas, which I propose is common to the semantic representation of both “for” PPs and infinitival purposives, and discuss the sorting of psoa attributes, which I use to encode the restriction of object-oriented purposives to main clause psoas with an affected object.
I then discuss the syntax of adjuncts in HPSG and present the analyses of purposive “for” PPs, infinitival purposives, and NP-modifying purposives in turn.
3.1 The semantics of purposives
I assume that the semantic representation of a purposive adjunct involves a psoa whose relation is purpose. This relation of purpose has two attributes whose values are both of the sort psoa: and . The value of is the psoa described by the clause the purposive adjunct modifies. The value of is the psoa described by the purposive adjunct. The attribute-value matrix (AVM) for this relation is as in (86).

(86) Rel purpose
I assume that this relation is common to all of the purposive adjuncts, both “for” PPs and infinitival purposives. The semantics of the purpose relation are that the person responsible for bringing about the psoa intends that as a result the psoa should come about. The necessity of the relation which indicates the individual responsible for a situation is argued for by Farkas (1988). This relation will not be explicitly formalized in my analysis, but it is an implicit part of the interpretation of the proposed purpose relation. Reference to the notion of responsibility enables an account of examples with purpose clauses and no overt agent, such as (87).

(87) The gun is in the drawer for you to use in an emergency.
3.2 The sorting of attribute labels
I follow Gawron and Peters (1990) in deriving the names for the attributes of a relation from the name of the relation itself. For example, the attributes of the “bake” relation will be and . Developing an extension of HPSG feature theory briefly entertained in Pollard and Sag (1994: 342f), I assume that feature labels are themselves organized into a sortal hierarchy. This embodies the idea from thematic role theory that there are groups of shared properties held by the participants in different relations which are theoretically significant. The difference from the standard conception of role theory is that the attributes of a particular relation can be as specific as necessary. The shared properties are the result of the formulation of a sort hierarchy of participant attribute labels. To distinguish them from feature structure sorts, these sorted labels are indicated by italicized capitals. For example, there will be a sort AGENT, of which the attributes , , , and all other attributes corresponding to agentive participants in
relations will be subsorts. The sort AFFOBJ is used to identify the presence of an affected object in a psoa. , , , and all the other attributes corresponding to affected objects will be subsorts of this sort. An affected object is one which is created, transferred, or transformed in some way by the action described in the psoa in which it appears. In the following analysis, the restriction on object-oriented purposives will be encoded as the requirement that there be an attribute of the sort AFFOBJ in the main clause psoa.
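To make the idea of sorted attribute labels concrete, the following toy sketch (not part of the chapter’s formalism) models the label hierarchy with Python classes, where subclassing stands in for the subsort relation; the relation-specific label names Baker and Baked are invented for illustration, and has_affobj mimics the requirement that a psoa contain an attribute of sort AFFOBJ.

```python
# Illustrative sketch only: attribute labels modeled as Python classes.
class AttrLabel:            # top of the attribute-label hierarchy
    pass

class AGENT(AttrLabel):     # supersort of agentive participant labels
    pass

class AFFOBJ(AttrLabel):    # supersort of affected-object labels
    pass

# Hypothetical relation-specific labels (names invented for illustration).
class Baker(AGENT):
    pass

class Baked(AFFOBJ):
    pass

# A psoa is sketched as a relation name plus a mapping from attribute
# labels to values (strings stand in for indices/contents).
bake_psoa = {"relation": "bake", "args": {Baker: "john", Baked: "a-cake"}}

def has_affobj(psoa):
    """True if some attribute label of the psoa is a subsort of AFFOBJ."""
    return any(issubclass(label, AFFOBJ) for label in psoa["args"])

print(has_affobj(bake_psoa))   # True
```

On this toy encoding, a psoa headed by a relation such as “work”, which has no AFFOBJ-sorted attribute, would fail the check, which is what blocks the recipient reading discussed earlier.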
3.3 The syntax of adjuncts in HPSG
I assume, in keeping with the version of the theory given in Pollard and Sag (1994), that adjuncts, such as purposive adjuncts, select for their heads rather than vice-versa. The combination of an adjunct and a head is mediated by a subsort of con-struc, namely head-adj-struc. The schema is defined as in (88).

(88) Head-adjunct schema: A head-adjunct structure is a DTRS value with two attributes: HEAD-DTR and ADJUNCT-DTR. This structure is constrained in that the MOD value of the ADJUNCT-DTR is token-identical to the SYNSEM value of the HEAD-DTR. The MOD value of an adjunct is an attribute of its HEAD value.

The head-adjunct schema can be given in AVM form as in (89).

(89)
- [ [1]] - [| [ [ [ [1]]]]]
The index [1] indicates the required token identity between the MOD value of the adjunct and the SYNSEM value of the head. The combinatory properties of an adjunct are specified within its MOD attribute. These combinatory properties constrain the type of head that the adjunct can combine with. The semantics principle, as in (90), ensures that the semantics of the head-adjunct combination are those of the adjunct daughter. I am not going to consider quantifier retrieval in this paper, so I will use the second version of the Semantics Principle given in Pollard and Sag (1994).

(90) Semantics Principle: In a headed phrase, the CONTENT value is token-identical to that of the adjunct daughter if the DTRS value is of sort head-adj-struc, and to that of the head daughter otherwise.
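The following schematic sketch, under the simplifying assumptions that feature structures are plain Python dictionaries and that token identity can be approximated by object identity, illustrates how the head-adjunct schema and the Semantics Principle interact; the key names mod, synsem, and content merely imitate the corresponding feature names.

```python
# Schematic sketch of the head-adjunct schema and the Semantics Principle.
def head_adjunct(head, adjunct):
    # The adjunct selects its head: its MOD value must be (token-)identical
    # to the head's synsem (approximated here by object identity).
    if adjunct["mod"] is not head["synsem"]:
        raise ValueError("adjunct cannot modify this head")
    # Semantics Principle, head-adjunct case: the mother's content is the
    # content of the adjunct daughter.
    return {"synsem": {"content": adjunct["content"]},
            "dtrs": {"head_dtr": head, "adjunct_dtr": adjunct}}

clause = {"synsem": {"content": {"relation": "bake"}}}
for_pp = {"mod": clause["synsem"],
          "content": {"relation": "purpose", "of": {"relation": "bake"}}}

mother = head_adjunct(clause, for_pp)
print(mother["synsem"]["content"]["relation"])   # purpose
```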
3.4 The analysis of “for” PPs
The interesting fact about “for” PP purposives is that they appear to perform very much the same function as infinitival purposives. In order to capture this fact I have proposed that both “for” PPs and infinitival purposives introduce a purpose relation which takes two psoa-valued arguments. In the case of
infinitival purposives, it is clear where these two psoas come from. The psoa is that described by the main clause, the clause the adjunct modifies. The psoa is that described by the clause in the infinitival purposive. In the case of “for” PPs, it is not immediately clear how the appropriate psoa is identified, since all “for” has as a complement is an NP. I propose that both the psoa and the purpose relation are part of the lexical semantics of the preposition “for.” The recipient interpretation of a “for” PP comes from the recipient lexical entry for “for,” as in (91).

(91) The recipient sense of “for”
〈for〉
[subcat 〈 〉] [3] | [ AFFOBJ [2]] 〈NP[1]〉 | Rel purpose [3] Rel receive [4] [1] [2] [4]
The tag [1] which appears after the attribute and on the NP that “for” subcategorizes for indicates that the inherits the semantics of the complement of “for.” This represents that fact that the entity denoted by the complement of “for” is the recipient. The function of the tag [2] is to ensure that the semantics of the is linked to the semantics of the affected object in the main clause psoa. The AFFOBJ specification must unify with some attribute in the main clause psoa. I assume that if a sorted attribute like this does not get unified with some attribute the representation is ill-formed. In order to access the AFFOBJ in the main clause psoa, the | does not make reference to the of the modified phrase directly, rather it accesses a new feature introduced here within . This is necessary in order to account for the semantics of constructions with more than one “for” PP. The feature in the modified phrase is structure-shared with the psoa which further adjuncts can modify. In a basic clause, it is structure-shared with the feature. If the main is a purpose relation, is structure-shared with the feature. This is indicated by the tag [4] in (91).8 In order for this analysis to work, in addition to , the feature must be inherited 8
The introduction of the feature avoids the need for a disjunctive feature path in | referring either to or |.
by the resulting phrase in a head-adj-struc. The function of the tag [3] is to ensure that the attribute of the purpose relation inherits the semantics of the main clause psoa. The psoa is the psoa described by the main clause, the clause being modified. The of an NP is usually a attribute which contains an index and a attribute which contains a list of restrictions on the index. For ease of exposition here I will make the assumption that NPs have a simpler semantics. I assume for the moment that the of “Mary” is mary′ and the of “a cake” is a–cake′. The signs for “Mary” and “a cake” will be as in (92) and (93).

(92) 〈Mary〉 | N 〈 〉 mary'

(93) 〈a cake〉 | N 〈 〉 a–cake'
I will use the more complex NP-semantics later in the discussion of NP modification. The sign of the “for” PP “for Mary” is like (91) except that the list is empty and the semantics of the NP “Mary” are unified with the value of the attribute. The resulting sign is as in (94).

(94) The sign for the recipient interpretation of “for Mary”
〈for Mary〉 [ 〈 〉] [3] | [AFFOBJ [2]] 〈 〉 | Rel purpose [3] Rel receive [4] mary' [2] [4]

I assume that the semantics of the benefactive sense of “for” are the same except that the psoa which is the value of the attribute contains a please relation with one attribute: the . There is no requirement that there be an affected object in the semantics of the main clause psoa. This type of “for” PP interpretation results from the lexical entry for “for” in (95).
(95) The benefactive sense of “for” 〈 for〉
|
[ 〈 〉] [2]
〈NP [1]〉 Rel purpose | [2] please [3] Rel [1] [3]
The tag [2] indicates that the main clause psoa is inherited by the attribute in the purpose psoa. The tag [1] indicates that the semantic contribution of the complement of “for” is inherited by the attribute. The acquire sense of “for” will be the same, but the value of the attribute has an acquire relation with the attributes and , as in (96).

(96) The acquire sense of “for”
〈for〉
[ 〈 〉] [3] | [ [1]] 〈NP [2]〉 | Rel purpose [3] Rel acquire [4] [1] [2] [4]
The object of acquire “for” is the , as indicated by the tag [2]. The is the agent of the main clause psoa as indicated by the tag [1]. Once again, is used to allow for expressions with multiple purposive adjuncts. Having introduced the various lexical entries for purposive “for” PPs and shown how they combine with a complement, I now illustrate the process of combining a “for” PP with a clause, using “John bakes a cake for Mary” as an example. I first consider the recipient sense of “for,” and assume that the sign for “John bakes a cake” is to be analyzed as follows (with the attribute suppressed for ease of exposition). is shared directly with .
(97)
〈John bakes a cake〉 V 〈 〉 Rel bake | [1] john' a–cake' [1]
There is a possible head-adj-struc which has the sign for “John bakes a cake” as its head daughter and the sign for the recipient interpretation of “for Mary” as its adjunct daughter, as in (98). (98)
〈John bakes a cake for Mary〉 [1][ V ] Rel purpose Rel bake [5] john' | [2] [3]a–cake' Rel receive [6] mary' [3] [6] 〈John bakes a cake〉 [1] - [4] [5] [5] 〈 for Mary〉 || [4] - [2] | [6]
The tag [4] is from the head-adj-struc in (89). It is this tag which indicates that the MOD value of the adjunct daughter unifies with the SYNSEM value of the head daughter. When this schema applies, this structure-sharing results in the value of within the feature of the adjunct unifying with the feature and therefore the in the -. The result is structure-shared into the feature of the -. The requirement that there be an affected object in the main clause psoa is met by the attribute. The [3] tag on the AFFOBJ specification is shared with the attribute in the psoa. As a result, the affected object, the object, which is the cake, is structure-shared with the value of the psoa. As a result of the semantics principle, the value of the resulting structure is the value of the adjunct. The semantic representation of the main clause psoa is inherited as the value of the attribute in the purpose psoa
through the tag [5]. I assume that the composition of the benefactive and acquire “for” PPs with a clause proceeds in much the same way. I turn now to consider cases where there is more than one “for” PP.

3.4.1 Sequences of “for” PPs
I assume that in an example such as (99), the “for” PP “for Mary” is in a head-adj-struc as described above, and, further, that that structure in turn is the head daughter of yet another head-adj-struc object in which “for her kids” is the adjunct.

(99) John baked a cake for Mary for her kids.

Examples of this form are what motivate the use of the feature in the analysis of “for” PPs. The feature provides access to the psoa which a purposive adjunct should modify. In the case of an unmodified clause, this is the value. In the case of a clause modified by a purposive “for” PP, it is the feature within the purpose relation which is structure-shared with . Purposive “for” PPs differ from infinitival purposives in that there is no clause within the purposive that further adjuncts can modify. It is this which necessitates the use of the feature to access the last psoa within the expression which is being modified. In (99), when the second “for” PP has a recipient interpretation, the structure-sharing in the head-adj-struc will lead to the structure-sharing of the semantics of “John baked a cake for Mary” with the attribute of the purpose relation associated with the second adjunct. The attribute in the value of in the topmost psoa will be unified with the attribute in the lower psoa and that in turn is unified with the semantics of “a cake.” The AFFOBJ requirement of the second adjunct is met as long as we assume that is a subsort of AFFOBJ. The resulting value for (99) will be as in (100).

(100) Rel
purpose Rel
purpose Rel bake john' [1]a–cake' Rel receive mary' [1] Rel receive her–kids' [1]
In this fashion, the analysis accounts for the fact that sequences of clause-modifying “for” PP adjuncts are interpreted recursively. The fact that this example can also mean that John baked a cake with the intention that Mary receive it, and that as a result of her receiving it the kids are pleased, also receives an account under this analysis. I assume that this latter sense of the example
involves a recipient interpretation of the first “for” PP and a benefactive interpretation of the second “for” PP. The resulting psoa will be as in (101). (101) Rel
purpose Rel
purpose Rel bake john' [1]a–cake' Rel receive mary' [1] Rel please her–kids'
The reading where both have a benefactive interpretation will also be possible. The psoa for that reading will have a please relation in both psoas. I observed earlier that there is no way for the first “for” PP to have a benefactive interpretation and the second to have a recipient interpretation. This follows from the analysis here, since given the form of the lexical entries corresponding to the different senses of “for,” the second “for” PP has to modify the psoa corresponding to the combination of the main clause and first adjunct. If that first adjunct has a benefactive interpretation, the resulting psoa will have a purpose attribute whose relation is the please relation. If the second “for” PP has a recipient interpretation it will require that the psoa which is the value of have an attribute of the sort AFFOBJ. There is no attribute of the sort AFFOBJ in the please relation psoa and thus the recipient interpretation will not be available for the second “for” PP.

The fact that a recipient “for” PP can follow an acquire “for” PP again follows from the analysis given here. The recipient “for” PP modifies the acquire psoa, and is an attribute of the sort AFFOBJ. The semantic representation for an example like (102) would be as in (103).

(102) John worked for food for his family.

(103) Rel
purpose Rel
purpose Rel work [1] john' Rel acquire [1] [2] food Rel receive his–family' [2]
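The recursive interpretation of stacked purposive “for” PPs can be pictured as repeatedly wrapping the content built so far inside a new purpose psoa. The sketch below is illustrative only: the attribute names MAIN, PURPOSE, AGENT, ACQUIRED, RECIPIENT, and RECEIVED are placeholder labels, not the chapter’s actual feature names.

```python
# Sketch of how successive purposive "for" PPs nest their purpose psoas.
def add_purpose(modified_psoa, purpose_psoa):
    """Wrap the psoa being modified inside a new purpose relation."""
    return {"relation": "purpose",
            "MAIN": modified_psoa,      # psoa of the expression modified
            "PURPOSE": purpose_psoa}    # psoa contributed by the adjunct

work = {"relation": "work", "AGENT": "john"}

# "for food": acquire sense, modifies the work psoa.
step1 = add_purpose(work, {"relation": "acquire",
                           "AGENT": "john", "ACQUIRED": "food"})

# "for his family": recipient sense, modifies the psoa built so far.
step2 = add_purpose(step1, {"relation": "receive",
                            "RECIPIENT": "his-family", "RECEIVED": "food"})

print(step2["MAIN"]["PURPOSE"]["relation"])   # acquire
```

The AFFOBJ requirement of the final recipient “for” PP is satisfied because the acquire psoa contributed by the first adjunct supplies an affected-object attribute, mirroring the discussion of (102).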
I return now to examples with pseudoclefts and VP-anaphora.
3.4.2 Pseudoclefts and VP-anaphora
The facts to be accounted for are the impossibility of a recipient interpretation of the “for” PP in (104) and (106).

(104) What [John did] for Mary was buy some candy. (*recip)
(105) What John did was buy some candy for Mary. (recip)
(106) John bought cookies and [he did it] for Mary. (*recip)
(107) John bought cookies and he bought them for Mary. (recip)

I assume that the verb “do” introduces a psoa whose relation is do and which has the attributes and . The expressions in square brackets in the examples above will have the semantic representations in (108) and (109).

(108) Rel do john' [1]

(109) Rel do he' it' [1]

In each case, the tag [1] is meant to indicate that the value of the attribute is structure-shared with the semantics of the VPs “buy candy” and “bought cookies” respectively. The semantics of these VPs does in fact contain an attribute of the sort AFFOBJ but, crucially, the recipient interpretation of “for” requires that there be an attribute of the sort AFFOBJ in the psoa denoted by the expression it directly modifies. In both (104) and (106) the “for” PP modifies a “do” psoa which does not have an attribute of the sort AFFOBJ and therefore the recipient interpretation is impossible.

3.4.3 Summary
In this section I have introduced the syntactic and semantic representations which are assigned to the purposive interpretations of a “for” PP. I have shown how the restriction of the recipient interpretation to modifying clauses with an affected object can be captured by assuming that the main clause psoa must have an attribute of the sort AFFOBJ. The analysis accounts for the recursive interpretation of clause-modifying “for” PPs by allowing a purpose psoa to be the psoa of another purpose psoa. This aspect of the analysis interacts with the restriction on the recipient interpretation that there has to be an affected object to explain the restrictions on the interpretations of a sequence of “for” PPs and the pseudocleft and deep VP-anaphora facts. I return now to infinitival purposives and show that they have the same basic analysis as purposive “for” PPs.
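Before turning to them, here is a toy illustration of the licensing restriction just summarized: the function below decides which purposive senses of “for” could attach to a given psoa. The AFFECTED set and the string-valued attribute labels are assumptions made for this example, standing in for the AFFOBJ sort hierarchy.

```python
# Sketch: which purposive senses of "for" can modify a given psoa?
# The recipient sense (like a PC) demands an affected-object attribute in
# the psoa it directly modifies; benefactive and acquire senses do not.
AFFECTED = {"baked", "bought", "acquired"}   # toy stand-ins for AFFOBJ subsorts

def licensed_senses(psoa):
    senses = ["benefactive", "acquire"]
    if AFFECTED & set(psoa["args"]):
        senses.append("recipient")
    return senses

buy_psoa = {"relation": "buy", "args": {"buyer": "john", "bought": "cookies"}}
do_psoa = {"relation": "do", "args": {"doer": "he", "done": "it"}}

print(licensed_senses(buy_psoa))  # recipient available: "...bought them for Mary"
print(licensed_senses(do_psoa))   # no recipient reading: "*...did it for Mary"
```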
3.5 The analysis of infinitival purposives
My goal in this section is to show how the external syntax and semantics of PCs and RatCs are like those of purposive “for” PPs. My concern is solely with the function of PCs and RatCs as units which modify clauses, and I do not attempt to capture the internal syntax of PCs and RatCs. To do so would involve a fully-fledged analysis of control and gapping and is beyond the scope of this paper, though some observations about the internal syntax of purpose adjuncts appear in the discussion below.9

I propose that, just as in the case of “for” PPs, the content of a PC or RatC specifies the intentions of the person responsible for the psoa that is modified. This observation is captured by assuming that infinitival purposives are clausal adjuncts whose semantic contribution is a psoa conveying a purpose relation, which takes two attributes whose values are of the sort psoa: and . The value of is the psoa described by the purpose clause and the value of is the main clause psoa. The difference between a RatC and a PC is that, like the recipient “for” PP, the PC requires that there be an attribute in the main clause psoa which is of the sort AFFOBJ. This will be specified in the value just as in the sign for the recipient “for” PP. The PC “for Sue to give to Peter” will have the sign in (110).

(110) 〈for Sue to give to Peter〉 [ 〈 〉] | [3] [AFFOBJ [2]] 〈 〉 Rel purpose | [3] Rel give sue' [4] [2] peter' [4]

It is not necessary to access the AFFOBJ through for examples with more than one infinitival purposive. Such cases can be accounted for by the second infinitival purposive being a modifier of the clause embedded within the first infinitival purposive. However, if we are to account for examples such as (111), in which an infinitival purposive modifies a clause modified by a purposive “for” PP, we need to use , since as in the case of multiple “for” PPs, there
Jones (1991: ch. 3) gives one perspective on the internal syntax of purpose clauses. Green (1991) and Cherny (1991) are both analyses of infinitival purposives in HPSG. The gapping could presumably be accounted for using features (See Pollard and Sag 1994: ch. 4).
is no clause embedded in the first adjunct for the second adjunct to modify directly. I return to this example below.

(111) The administrators bought it for linguists for them to sort data with.

When the sign in (110) combines with “John baked a cake,” the tag [2] ensures that the value will become shared with the attribute , which is of the sort AFFOBJ. The resulting semantics will be as in (112).

(112) Rel
purpose Rel Rel
bake john' [2]a–cake' give sue' [2] peter'
I am assuming that the external syntax and semantics of a RatC differ only in that there is no requirement that there be an affected object in the main clause psoa. The RatC “in order to please Mary” will have the sign given in (113). (113) 〈in order to please Mary〉 | 〈 〉 Rel purpose | [1] Rel [3] [3]
[ 〈 〉] [1] [ [2]]
please [2] mary'
The sign in (113) will be licensed as an - in a head-adj-struc in just the same way as a benefactive “for” PP. The main clause psoa is unified into the value of in the purpose relation through tag [1]. For example, the semantic content of “John baked a cake in order to please Mary” will be as in (114). (114) Rel
purpose Rel bake [2] john' a–cake' Rel please [2] mary'
The tag [2] ensures that the agent of the psoa is the agent of the psoa which is the value of . This is an oversimplification since it is not always the agent of the main clause that controls the subject gap, but there is not space to investigate this issue further in this paper. This analysis captures the various other facts that are shared between “for” PPs and infinitival expressions of purpose. The impossibility of PCs modifying a “do” psoa in pseudoclefts and deep VP-anaphora is the result of the AFFOBJ constraint requirement on the psoa. The restrictions on possible sequences of PCs and RatCs follow from the same constraint. The analysis also accounts for the fact that if there is a sequence of a “for” PP and a PC modifying a clause, the “for” PP has to have the recipient interpretation. In example (111) above, “for linguists for them to sort data with” is such a sequence. The infinitival purposive is clearly a PC since it has a gap after “with.” For this example to be felicitous, “for linguists” must have a recipient interpretation. This follows from my analysis since the “for” PP will contribute a purpose relation of which the main clause psoa is an argument. The PC “for them to sort data with” has to modify that purpose psoa and cannot directly modify the main clause psoa. It requires that there be an attribute of the sort AFFOBJ in the psoa which is the value of . The only sense of the “for” PP which has an attribute of the sort AFFOBJ is the recipient sense and thus in order for this example to be felicitous the “for” PP can only have the recipient interpretation. The resulting semantic representation will be as in (115). I will not address the issue of the reference of “them” here. (115) Rel
purpose Rel
purpose Rel Rel Rel sort them' data' [1]
buy the– administrators' [1]it' receive linguists' [1]
3.5.1 Decisions about the internal syntax
While I have not given a formal account of the internal syntax of PCs and RatCs, I would like to make some general comments about the problems that are involved. Two crucial issues are the obligatorily controlled gap in PC and the issue of where in PC and RatC the purposive meaning hangs.
PCs require a gap corresponding to the affected object in the main clause psoa, while RatCs cannot have such a gap; rather, they can have only a subject gap, controlled by the person responsible for the main clause psoa. This difference is clearly related to the object-oriented vs event-oriented distinction. The obligatory gap in a PC is the way in which English chooses to indicate that the PC is object-oriented. RatCs, on the other hand, simply express a relation between an event and an intended result. The existence of a shared object is simply coincidental in those cases.

The second issue concerns how the signs given above for a PC and a RatC are composed. In a “for” PP the preposition “for” is always present, so we can assume that it is the source of the purposive meaning. However, in PCs and RatCs, the “for” only appears if there is an overt subject in the PC or RatC, as shown in (116) to (119).

(116) John brought it for Mary to give to Peter. (PC with “for”)
(117) John brought it to give to Peter. (PC without “for”)
(118) John brought it for Mary to practice on it. (RatC with “for”)
(119) John brought it to practice on it. (RatC without “for”)
One possibility would be to assume that if “for” is absent there is an empty operator which contributes the purposive meaning, a solution which however does not fit well within the HPSG framework. An alternative that the framework allows would be to assume that there is a phrase structure schema especially for the expression of purpose. The fact that the psoa described in the PC or RatC becomes the value of in the main clause psoa could be stated as a restriction on the schema. The account could be extended to cover the “for” PPs as well. Further development of this idea remains as a direction for further work.
3.6 NP-modifying purposives
Both recipient “for” PPs and PCs can function as post-nominal modifiers. Examples are “for Mary” in (120) and “for Mary to meet” in (121).

(120) [The book for Mary] is on the table.
(121) [The man for Mary to meet] just walked in.

These modifiers specify the purpose to which the object denoted by the NP which they modify is to be put. In (120), the expression “for Mary” contributes the fact that Mary is the intended recipient of the book. In (121), the expression “for Mary to meet” contributes the fact that “the man” is someone that Mary is intended to meet. This section examines the properties of these constructions in more detail. I will first show how the NP-modifying object-oriented adjuncts can be distinguished from their clause-modifying counterparts. I will then extend the formal analysis to account for the semantics of object-oriented NP-modifiers. I then turn to the event-oriented purposives and develop an account of their function as NP-modifiers of event-denoting NPs.
3.6.1 Distinguishing NP-modifiers from VP-modifiers
It is very important for us to be able to distinguish purposive NP-modifiers from purposive clausal modifiers because examples where the modified NP is in object position can be ambiguous. For example, (122) has two possible readings, which are described in (A) and (B).

(122) John bought the cake for Mary.

(A) John bought the cake with the intention that as a result Mary would receive the cake. (This is the clause-modifier reading.)
(B) Mary’s receiving the cake is not related to John’s buying the cake. He does not intend Mary to receive the cake; in fact, he may not even know that Mary is meant to receive the cake. (This is the NP-modifier reading.)

There is a similar ambiguity in (123).

(123) John bought the cake to give to Mary. (PC)

One way to disambiguate this sort of example is to add the adverb “accidentally” to the clause. In (124), the adverb “accidentally” contributes the fact that John’s buying of that particular cake was not intentional and thus the occurrence of a purposive clause modifier is not licensed. Thus in (124), the (B) reading, which corresponds to NP attachment of the adjunct, is the only one available.

(124) John accidentally bought [the cake for Mary] and gave it to Joan.

Similarly, the ambiguity of (123) is not present in (125).

(125) John accidentally bought [the cake to give to Mary] and gave it to Joan.

It is also possible to isolate the different meanings by altering the syntactic context. The clause-modifier reading can be isolated by making the NP one which does not take modifiers, such as a pronoun or a proper name as in (126) and (127), preceding it with a finite relative clause, as in (128), or ensuring that there is another complement intervening, as in (129).

(126) John bought it for Mary.
(127) John bought Little Sally for Mary.
(128) John bought some candy that was cheap for Mary.
(129) John gave the book to Sally for Mary.

Similarly, the PC “to give to Mary” can only have the clause-modifier reading in the parallel examples (130) to (133).

(130) John bought it to give to Mary.
(131) John bought Little Sally to give to Mary.
(132) John bought some candy that was cheap to give to Mary.
(133) John gave the book to Sally to give to Mary.
The NP-modifier reading can be isolated by getting the NP into subject position. This is the case in passives like (134) and (135) and with the verbs “have” and “be,” as in examples (136) through (139).

(134) [The cake for Mary] was bought by John.
(135) [The cake to give to Mary] was bought by John.
(136) [The cake for Mary] is on the table.
(137) [The cake to give to Mary] is on the table.
(138) [The cake for Mary] has red icing.
(139) [The cake to give to Mary] has red icing.

Another syntactic difference is that the complement of “for” can only be questioned when the “for” PP is a clause modifier. This is because it is not possible to extract out of an adjunct to the head of an NP to form a question.

(140) Who did John bake a cake for?

Having demonstrated how NP-modifying purposives can be distinguished from their clause-modifying counterparts both syntactically and semantically, I now show how the analysis of clause-modifying purposives can be extended to account for their NP-modifying counterparts.

3.6.2 The analysis of object-oriented purposive NP-modifiers
I will give an analysis here of NP-modifying recipient “for” PPs. The same analysis can be extended to account for true purpose clauses (PCs). I will only present the analysis of “for” PPs here, though. The semantic contribution of a recipient “for” PP in an example like (141) is that the object denoted by the NP it modifies was intended to be received by Mary. The individual who intended Mary to receive the cake is not specified.

(141) [The cake for Mary] is on the table.

This sort of example shows that there is no necessary relation between the main clause psoa of the clause in which the NP appears and the intention that Mary receive the cake. This analysis is going to make use of the full version of NP semantics in HPSG rather than the abbreviated version that I have been assuming so far. Up until now I have assumed for ease of presentation that NPs contribute atomic semantics like a–cake′ and john′. Pollard and Sag (1994) propose that the CONTENT value of an NP consists of two attributes: INDEX and RESTRICTION. The value of INDEX is a structure of sort parameter, which can be thought of as the HPSG analog of a reference marker in DRT (cf. Kamp (1981)). It contains an index and person, number, and gender information. I will suppress the latter here and only give the index. The RESTRICTION attribute contains a set of psoas. This set places conditions on the entities that the parameters appearing in them can be anchored to. I am not going to consider the semantics of determiners here for ease of exposition. The sign for “the cake” will be as in (142).
(142) 〈the cake〉 |
N 〈 〉 [1] cake Rel [1]
An utterance of “the cake for Mary” adds an index with the restriction that the entity the index becomes anchored to is an instance of a cake, just as “the cake” does. The contribution of “for Mary” is to place the further restriction on the index that it be an object which some agent intended Mary to receive. I propose that this be captured by adding another psoa to the RESTRICTION value. This psoa is just like the one that is in the semantics of a clause-modifying purposive adjunct. It is a purpose relation with two psoa-valued attributes: and . The psoa is that Mary receive the object that is denoted by the NP. The value of is underspecified other than that there is a requirement that it contain an affected object. The desired sign is as in (143).

(143) 〈the cake for Mary〉 | N 〈 〉 [1] Rel Rel
cake , [1] purpose [ AFFOBJ [1]] Rel receive mary' [1]
To grasp the thrust of this description, one should think in terms of the satisfaction conditions on determining whether a particular entity is “the cake for Mary.” The discourse model can be thought of as a set of AVMs which represent the properties of all the individuals in the discourse and all the things that have taken place. The process of identifying that a particular entity was “for Mary” involves searching through the model looking for a psoa which unifies with the second psoa in the restriction set shown above in (143).

I will assume that the method of combination of a “for” PP NP-modifier and the NP it modifies is exactly the same as the combination of a clause-modifying “for” PP with a clause. The two both fit into a head-adj-struc. I propose that this NP-modifying recipient sense of “for” has the sign in (144) as its lexical entry.
(144) 〈 for〉
N 〈 〉 | [2] [3] 〈NP [1]〉 [2] | Rel purpose [AFFOBJ [2]] [3]< Rel receive [1] [2]
The tag [1] indicates the fact that the complement of “for” describes the entity which is intended to receive the object denoted by the NP. When this sign is the adjunct daughter in a head-adj-struc, the value will be unified with the value of the head daughter. As a result, the tag [2] is unified with the in the of the NP and the set of restrictions on the NP are unified with [3]. The resulting semantics of the whole adjunct and head combination will be the value of the adjunct, in accordance with the semantics principle. This will have the tag [2] and a which is the result of the union of the restrictions on the NP ([3]) and a purpose psoa whose attribute is a psoa with the receive relation, in which the attribute has the tag [2]. This analysis predicts that there is no syntactic control involved in an NP-modifying purpose adjunct. The only way to work out whose intentions are involved is to look back in the discourse model.

The analysis I have presented is further supported by certain other differences between the interpretation of sequences of “for” PPs in an NP as opposed to sequences of “for” PPs modifying a clause. I have already shown that in a sequence of recipient “for” PP clause-modifying adjuncts the second necessarily modifies the first. For example, in the interpretation of (145) there is a necessary order of possession of “it” which can be diagrammed as in (146).

(145) John bought it for Mary for her kids.

(146) John → Mary → her kids

This is accounted for by my analysis, since the second “for” PP has to modify the purpose psoa contributed by the first. The interesting difference with NP-modifiers is that this ordering is not required. For example, in the interpretation of (147), the intended order of possession of the cake is not fixed. The cake may be to be given to Mary so that she then gives it to her kids, or the cake may be an object intended for Mary which is given first to her kids so that they give it to her. These two readings can be diagrammed as in (148) and (149).
(147) [The cake for Mary for her kids] is on the table.

(148) → Mary → her kids
(149) → her kids → Mary

The availability of this range of interpretations is expected under my analysis. Each psoa will be added to the restriction set in turn. It is not necessary that the second “for” PP modify the first because they are added into a set of unordered conditions on a referring index. I now want to consider the analysis of examples where acquire and benefactive “for” PPs and RatCs function as NP-modifiers of event-denoting NPs.

3.6.3 Event-oriented NP-modifiers
I have characterized acquire “for” PPs, benefactive “for” PPs, and RatCs as event-oriented NP-modifiers. They share the property that, unlike recipient “for” PPs and PCs, they cannot modify NPs which denote entities.10 This is most clearly seen in examples with RatCs, as in (150) and (151).

(150) *[The cake (in order) for John to eat it] is on the table.
(151) *[The package (in order) for Bill to collect it] is at the post office.

The event-oriented modifiers can, however, modify NPs which denote intentional actions, as shown in (152), (153), and (154).

(152) [The party for Mary] kept her smiling all week. (benef)
(153) [Our country’s battle for basic human rights] still goes on. (acquire)
(154) [The preliminary investigation (in order) to find plausible suspects] attracted a lot of media attention. (RatC)

If we assume that the values of these NPs are psoas, then these examples can be given an analysis exactly parallel to clause modification by event-oriented purposives. The sign for an event-oriented purposive only requires that the head modified have an empty list and that its be a psoa with an AGENT attribute. There is no specification of the syntactic category of the head it modifies. If an NP denotes a psoa, it can also be modified by an adjunct with the same sign as a clause-modifying event-oriented purposive adjunct.
10 There are some cases in which these senses of “for” PPs appear to be modifying an entity-denoting NP:
(i) The sticky strip for foolish insects was a very popular feature of our product. (acquire)
(ii) The blue icing for Mary was put on by John. (benefactive)
It may be that these can be assimilated to the recipient interpretation. This remains as a direction for further work.
3.6.4 Summary
In this section, I have shown how the analysis of clause-modifying purposive adjuncts which I have proposed can be extended to account for the properties of NP-modifying purposive adjuncts. They involve the same purpose relation and they can appear together in accordance with the phrase structure schema head-adj-struc. The difference is that the purpose psoa introduced by the adjunct becomes one of the restrictions on the index associated with the parameter introduced by the NP. This analysis accounts for the fact that clause-modifying adjuncts are necessarily interpreted recursively in accordance with their order of appearance, while NP-modifying adjuncts do not have to be interpreted recursively. Event-oriented purposives can also modify NPs, so long as their semantic representation is a psoa with an agent.
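The contrast just summarized between clause-modifying and NP-modifying purposives can be sketched as two different modes of composition: wrapping the content built so far versus accumulating restrictions on an index. The sketch below is a simplification; the keys index and restr merely imitate the NP semantics used above, and the purpose psoas are abbreviated.

```python
# NP-modifying purposives: each adjunct adds an independent restriction
# psoa to the NP's restriction set, so no nesting (and no fixed order of
# possession) arises.
def modify_np(np_content, purpose_psoa):
    new = dict(np_content)
    new["restr"] = np_content["restr"] + [
        {"relation": "purpose", "about": np_content["index"],
         "goal": purpose_psoa}]
    return new

the_cake = {"index": "x1", "restr": [{"relation": "cake", "inst": "x1"}]}
step1 = modify_np(the_cake, {"relation": "receive",
                             "recipient": "mary", "received": "x1"})
step2 = modify_np(step1, {"relation": "receive",
                          "recipient": "her-kids", "received": "x1"})

# Two independent conditions on the same index, unlike the nested purpose
# psoas produced by clause-modifying adjuncts.
print(len(step2["restr"]))   # 3
```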
4 Conclusions
There are three different purposive senses of a “for” PP: the recipient, benefactive, and acquire senses. These introduce psoas in which the relations are receive, please, and acquire respectively. Like infinitival purposives, this subset of the uses of a “for” PP functions to specify the intentions of the agent who brought about the main clause psoa, the clause they modify. I analyze both purposive “for” PPs and infinitival purposives as purposive adjuncts, a class which is divided into two subgroups: object-oriented purposives and event-oriented purposives. The object-oriented purposives are the recipient “for” PPs and true purpose clauses (PCs). They share the property that they require the psoa they modify to have an attribute of the sort AFFOBJ. They also share the property of being future-oriented and can both modify object-denoting NPs. The event-oriented purposives are the benefactive and acquire senses of a “for” PP and rationale clauses (RatCs). They share the property of not requiring that there is an affected object in the psoa described by the clause they modify. They are nonpast as opposed to future-oriented and they modify NPs which denote psoas, but not object-denoting NPs.

An analysis has been formulated in head-driven phrase structure grammar. Both NP- and clause-modifying adjuncts are captured within the head-adj-struc phrase structure schema proposed in Pollard and Sag (1994). The similarity of meaning between purposive “for” PPs and infinitival purposives is captured by assuming that they both introduce a purpose relation which has two psoa-valued attributes. The first of these is labeled and is structure-shared with the of the clause modified. If the adjunct is a PC or a recipient “for” PP there must be an attribute of the sort AFFOBJ in the psoa. This restriction accounts for the fact that object-oriented purposives cannot appear in pseudocleft constructions or modifying “do it” in examples with deep VP-anaphora. The psoa is, in the case of the infinitival purposives, the psoa described by the clause in the infinitival purposive. In the case of the “for”
PPs, the psoa comes from the lexical semantics of “for.” The analysis accounts for the fact that when there is more than one purposive adjunct they are necessarily interpreted recursively. The second purposive adjunct has to modify the psoa introduced by the first purposive adjunct and so on. Along with the restriction of object-oriented purposives to psoas with an affected object, this accounts for the restrictions on the interpretations of sequences of purposive adjuncts. The fact that NP-modifying purposives do not have to be interpreted recursively follows from the assumption that each NP-modifying purposive contributes a separate restriction to the parameter index of the NP that it modifies. The conception of syntax and semantics in head-driven phrase structure grammar is particularly well suited to this sort of analysis. Two properties that were critical in this analysis are the parallel representation of syntactic and semantic information and the structured nature of the semantic representation. I now want to consider some possible directions for further work. The first of these is the explanation of multiple interpretations of a “for” PP.
4.1 Pragmatic rules for multiple interpretations
I believe that the proper analysis of the semantics of “for” PPs has been muddied by the operation of certain pragmatic rules. An example like (155) can have an interpretation of the “for” PP in which Mary is not only the intended recipient of the cake but also John intends to please Mary.

(155) John baked a cake for Mary.

The appearance of this apparent multiple interpretation of a “for” PP can be reconciled with the analysis I have proposed here if there is the general rule of pragmatic interpretation given in (156).

(156) Benefactive Interpretation Rule: If the purpose for an agent’s action involves some sentient being or group of beings, assume as a default that the agent intends to please that being or group of beings by performing the action.

The multiple interpretation will arise when “for Mary” has the recipient sense. Since the purpose of John’s action involves a person, namely Mary, the benefactive interpretation will arise by default. The further investigation of the relevance of pragmatic rules of this sort remains as a direction for further work.
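A toy rendering of the Benefactive Interpretation Rule as a defeasible default is given below; the SENTIENT set and the flat dictionary representation of psoas are assumptions made purely for illustration.

```python
# Sketch of the Benefactive Interpretation Rule as a default inference:
# if the purpose of an action involves a sentient being, add (by default)
# the inference that the agent intends to please that being.
SENTIENT = {"mary", "john", "her-kids"}   # toy stand-in for a sentience test

def benefactive_default(purpose_psoa, agent):
    beings = [v for v in purpose_psoa.values() if v in SENTIENT]
    if beings:
        return {"relation": "please", "agent": agent, "pleased": beings[0]}
    return None   # no sentient being involved: no benefactive inference

recipient = {"relation": "receive", "recipient": "mary", "received": "a-cake"}
print(benefactive_default(recipient, "john"))
# {'relation': 'please', 'agent': 'john', 'pleased': 'mary'}
```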
4.2 Further applications
There is a wide range of possible further applications of the analysis proposed here. The proposal that attributes of a psoa be sorted will allow the formulation of a more elegant and structured account of semantic roles. This
may be tied in with the notion of sorted relations to account for the relations between different types of relations and sorts of participants they have. The analysis of “for” PP adjuncts in terms of head-adj-struc should be extendable to account for other adjuncts such as “because” clauses, locatives, temporals, comitatives, and instrumentals. Another long-term goal is to examine the expression of purpose in other languages in order to identify the universal aspects of the expression of purpose and what is language specific.
References
Allerton, D. J. 1982. Valency and the English Verb. New York: Academic Press.
Bach, Emmon. 1982. Purpose clauses and control. In The Nature of Syntactic Representation, ed. P. Jacobson and G. K. Pullum. Dordrecht: Reidel. 35–57.
Chafe, Wallace. 1970. Meaning and the Structure of Language. Chicago: University of Chicago Press.
Cherny, Lynn. 1991. Purpose Clauses in HPSG. MS. Stanford University.
Faraci, Robert. 1974. Aspects of the Grammar of Infinitives and “for” Phrases. Ph.D. dissertation. Massachusetts Institute of Technology.
Farkas, Donka. 1988. On Obligatory Control. Linguistics and Philosophy 11: 27–58.
Fillmore, Charles. 1968. The case for case. In Universals in Linguistic Theory, ed. Emmon Bach and Robert T. Harms. New York: Holt, Rinehart and Winston Inc. 1–88.
Gawron, Mark. 1986. Situations and prepositions. Linguistics and Philosophy 9: 327–382.
Gawron, Mark and Stanley Peters. 1990. Anaphora and Quantification in Situation Semantics. CSLI Lecture Notes Series no. 19. Stanford: Center for the Study of Language and Information.
Green, Georgia. 1974. Semantic and Syntactic Regularity. Bloomington: Indiana University Press.
Green, Georgia. 1991. Purpose Clauses and Their Relatives. MS. University of Illinois Urbana-Champaign.
Halliday, M. A. K. 1971. Language structure and language function. In New Horizons in Linguistics, ed. John Lyons. Harmondsworth: Penguin. 140–165.
Hankamer, Jorge and Ivan Sag. 1976. Deep and surface anaphora. Linguistic Inquiry 7.3: 391–426.
Jones, Charles. 1985. Agent and Patient and control into purpose clauses. In Proceedings of the Parasession on Causatives and Agentivity. Chicago Linguistics Society 21. 105–119.
Jones, Charles. 1991. Purpose Clauses. Studies in Linguistics and Philosophy. Dordrecht: Kluwer.
Kamp, Hans. 1981. A theory of truth and semantic representation. In Formal Methods in the Study of Language, ed. Jeroen Groenendijk, Theo M. Janssen, and Martin Stokhof, vol. I. Amsterdam: Mathematische Centrum.
Platt, H. 1971. Grammatical Form and Grammatical Meaning: A Tagmemic View of Fillmore’s Deep Structure Case Concepts. North Holland Linguistics Series 5. Amsterdam: North-Holland.
Pollard, Carl and Ivan Sag. 1987. Information-Based Syntax and Semantics, vol. 1: Fundamentals. CSLI Lecture Notes Series no. 13. Stanford: Center for the Study of Language and Information.
Pollard, Carl and Ivan Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information.
Somers, H. L. 1987. Valency and Case in Computational Linguistics. Edinburgh: Edinburgh University Press.
Wallace, Karen K. 1986. Infinitival Purpose Constructions in English Syntax. Master’s thesis, University of California at Los Angeles.
3 On lexicalist treatments of Japanese causatives1
Takao Gunji
Osaka University
1
Introduction
The causative construction, along with other “complex predicate” constructions, has been analyzed in one of two ways: an analysis in which the causative sase is treated as a verb that is a sister of the verb phrase headed by the verb, and an analysis that supposes that the causative is morphologically a suffix and that the sequence consisting of a verb and the causative suffix is equivalent to a lexical verb. While many pieces of the syntactic and semantic evidence seem to favor the former analysis, morphological and phonological properties clearly suggest that a verb stem followed by the causative is indeed a word. In addition, interaction with some lexical processes seems to be best explained by the latter approach. In classical transformational analyses, this apparent dilemma was solved by postulating a biclausal structure in a structure different from the surface one, namely, the deep structure; the head verb of the embedded clause is subsequently raised to amalgamate with the causative sase by the time the surface structure is obtained (cf. Shibatani 1973). Essentially the same kind of analysis is assumed in more recent transformational analyses, in which the operation of head movement creates the morphological and phonological integrity in PF (phonetic form), while in LF (logical form) the biclausal structure is essentially retained.2 1
2
I owe Dan Flickinger, Kaz Fukushima, Sirai Hidetosi, Chris Manning, Naoko Maruyama, Yoshiko Matsumoto, Yuji Matsumoto, Peter Sells, Ivan Sag, Yoshiko Sheard, the graduate students for my class of 1993–1994 academic year, the participants of the UPSG Workshop in Tokyo, December 1995, and an anonymous reviewer for this volume for valuable comments on earlier versions of this paper, though the author alone is held responsible for any remaining errors. The research reported here has been supported in part by grants from the Ministry of Education, Science, and Culture. Or reinvented in the kind of analysis prevalent in the last decade in which LF is to be derived from an s-structure, cf. Kitagawa (1986) for his analysis of “affix raising” at LF.
120
In the (strict) lexicalist approach, in which no reorganization transformation is available, the above dilemma may lead to a major problem. There might appear to be no other choice than to assume a morphological and phonological integrity in the first place and explain everything based on such a configuration. Actually, one concrete proposal is made in the framework of HPSG (Pollard and Sag 1994): Iida, Manning, O’Neill, and Sag (1994) and Manning, Sag, and Iida (this volume) treat the concatenation of a verb stem and the causative as a single complex lexical item and no biclausal structure is assumed in the constituent structure.3 In this paper, I will try to propose a different kind of lexicalist approach also in the framework of HPSG and a related framework (Gunji 1995, Gunji and Harada 1998), in which a kind of “biclausal” structure (VP-embedding, to be exact) is retained in the constituent structure, while an essentially “monoclausal” morphological and phonological structure is also available. Specifically, I will explore the possibility of dual representation of Japanese causative constructions along the line of tecto/phenogrammatical distinction proposed by Dowty (1996) and argue that a “biclausal” nature shows up at the tectogrammatical level, while the phenogrammatical level is flatter and essentially a monoclausal structure.4 We first review available arguments for VP-embedding treatments of causatives, occasionally mentioning drawbacks of such an approach, and later propose a unified account subsuming the two apparently incompatible structures. In the latter half of the paper, I will propose a treatment of honorification in the current approach, as it also seems to motivate a “biclausal” approach for causatives but at the same time poses a major obstacle to the kind of VP-embedding approach proposed here.
2
Preliminaries
The feature structure we assume here is an extended version of the Japanese phrase structure grammar ( JPSG) system as exhibited in Gunji (1987, 1995), and Gunji and Harada (1998). In general, a linguistic expression corresponding to a sign in HPSG is expressed by the following feature representation, where represents morphological and phonological information, and syntactic and semantic (i.e., constituency) information. I will identify the former with the phenogrammatical information and the latter with tectogrammatical information. 3
4
This kind of analysis has also been proposed in the transformational lexicalist approach. See Miyagawa (1980). The “biclausal” structure is not necessarily a structure represented by a syntactic tree (i.e., a structure represented by the attribute in HPSG). It could well be a representation of a hierarchical structure in some part of syntax and semantics. What is essential here is that there can be some kind of hierarchical representation in some part of the syntactic and/or semantic representation to determine the semantic scope and other syntax/semantics related facts.
On lexicalist treatments of Japanese causatives
(1)
121
sign
morphon morph phon synsem local
core head content valence
nonlocal set(synsem)
As in HPSG, the head has subsorts depending on its part of speech, such as verb, noun, and postposition,5 each of which may have some of the head features, such as (grammatical relation) for postpositions, and for verbs.6 The valence may have further specifications such as the feature and the feature for some categories:7 (2)
valence list(synsem) list(synsem)
We usually abbreviate | simply as when there is no value. A new kind of local feature closely related to the valence features is the argument structure feature - (cf. Iida et al. 1994). It appears only in a 5
6
7
A postposition is a counterpart of English preposition. It follows, rather than precedes, a noun phrase and gives such information as grammatical relation, case, thematic role, etc. The feature is what bundles the feature and the feature. It allows you to pick up a feature structure larger than but smaller than , including semantic information but excluding valence information. The main argument in this paper, however, doesn’t depend on this specific feature geometry. Unlike my previous treatment in JPSG (Gunji 1987), I will assume that the subject and other complements in the list are ordered according to their obliqueness as in HPSG. As will be shown later, this ordering doesn’t affect the surface ordering and we still get the scrambling result. The difference between the feature and the feature is that the element in the list of the value is required to be morphophonologically adjacent to the head (cf. (42)). Another possible specification would be to have separate lists for the subject () and nonsubject complements (), as proposed in Borsley (1987) and ch. 9 of Pollard and Sag (1994): (i) list(synsem) list(synsem) Since we mark each complement by its grammatical relation (), we will not adopt this strategy in this paper.
122
lexical item and is canonically obtained by appending all the lists given as the value of the valence features in the order of increasing obliqueness.8 Thus, for a lexical head, we have: (3)
Lexical heads: 1 list(synsem) 2 list(synsem) - 1 % 2
where X%Y is a list obtained by appending the lists X and Y, that is, append(X,Y,X%Y).
In the following, since the - feature subsumes the distinction between , on one hand, and and , on the other, I will show only the - feature instead of the valence features, if both are available, unless the value of the valence feature itself is at issue. Also, the sortal specification will usually be omitted where there is no fear of confusion. I will also abbreviate | as in the following since nonlocal features are not at issue in this paper. A postposition like ga, as a subject marker, will have a feature structure like the following:9 (4)
〈ga〉 postposition sbj nom
i 〈〉 〈1NPi 〉 〈1〉
-
Based on such a feature specification for a postposition, a subject postpositional phrase, abbreviated as p[sbj]i , has a feature structure like the one shown below:10
8
9
10
If the valence consists of and (and ), the value of - is the append of these three lists in this order. Unlike the valence features, the value of the - feature is not inherited (by cancelling off the sister complements) to the mother phrase. Thus, it is a strictly lexical feature. In the following, the value of will be simply shown as a list of “phonemes.” A counterpart of “phoneme” is reconstructed in our approach from phonological elements and morae. See Gunji and Matui (1995), Gunji and Harada (1998) for details. Although I will assume that the subject and the object are postpositional phrases headed by the respective postpositions, they could as well be analyzed as noun phrases followed by case markers.
(5)
i 〈〉
postposition sbj nom
In this paper, a lexical item of the following form is assumed for sase :11 (6)
〈sase〉
verb cause i j l 〈1p[sbj]i , 2p[obj]j 〉 verb 3 l 〈p[sbj]j 〉 - 〈1, 2, 3〉
That is, sase is a kind of verb that is required to be adjacent to a verb phrase (or an intransitive verb). Moreover, it subcategorizes for (in addition to the adjacent verb phrase) a subject and an object, where the adjacent verb phrase subcategorizes for a subject that is semantically identical (denoted by the use of the same index j ) to the object of sase. In order to make the discussion in this paper more specific, I will assume that the following kind of lexical correspondence for causativized verb stems is specified in the “monoclausal” lexicalist approach.12
11
12
The causative morpheme sase phonologically has two forms: [sase] and [ase]. The former appears when it is adjacent to a VP whose head verb has a stem that ends with a vowel, while the latter is used when it is adjacent to a VP whose head verb has a stem that ends with a consonant. These alternative forms can be explained in the framework of a constraint-based phonology we are developing for Japanese (Gunji and Matui 1995, Gunji and Harada 1998). As for the value of the feature, I assume that the semantic objects of the sort content have subsorts, each of which corresponds to in HPSG. This alternative approach is briefly mentioned in Pollard and Sag (1994: ch. 8). I have reconstructed the following feature specification based on the feature structures proposed for causatives in various languages in Manning and Sag (1995) and Manning et al. (this volume). See the latter for an approach to derive this kind of structure from more general lexical information.
(7)
Complex lexical specification for causativization in “monoclausal” approach Given a verb stem stem α verb l - 〈p[sbj]j |β〉 the causative stem is: cause-stem 〈1sase〉 verb cause i j l - 〈p[sbj]i , p[obj]j |2〉 1α verb l - 〈p[sbj]j|2β〉
That is, for each verb there is a causative counterpart with the suffix -sase and whose argument structure list is changed so that the list has one additional member. Thus, the causative counterpart of an intransitive verb is a transitive verb and the causative counterpart of a transitive verb is a ditransitive verb.13 Let us illustrate the difference in feature representations for the two approaches. A transitive verb like mi “see” will have the following feature specification: (8)
stem
〈mi〉
13
verb see j k - 〈p[sbj]j , p[obj]k〉
In addition, part of the information of the preceding verb stem is retained in the complex causative verb under the innovative (apparently local) feature . As will be shown later, this addition is essential in the “monoclausal” approach to cope with some of the problems raised in the next section. Alternatively, Manning et al. (this volume) propose to have a nested - list. In this approach, the value of - will be 〈p[sbj]i ,p[obj]j ,〈PROj |2〉〉. This will also be crucial for some of the problems raised in the next section.
Then, the current approach with VP-embedding will assume the following feature structure for the verb complex mi-sase :14 (9)
phrase
〈X·wo, mi·sase〉
verb cause i j
see j k 〈p[sbj]i , p[obj]j〉
This feature structure corresponds to the top TVP node in the following constituent structure, in which two separate lists of - appear in separate lexical items, namely in mi and in sase : TVP[ 〈4, 5〉]
(10)
1VP[ 〈3〉]
〈1〉 V - 〈4p[sbj]i, 5p[obj]j , 1〉
2PPk
V[- 〈3p[sbj]j , 2〉]
X-wo
mi
sase
On the other hand, the “monoclausal” approach sketched above will assume a causative verb mi-sase as a ditransitive lexical item by the complex lexical specification for causatives shown in (7): (11)
word
〈mi-sase〉
verb cause i j 3 - 〈p[sbj]i, p[obj]j , 2〉 〈mi〉
verb see 3 j k - 〈p[sbj]j , 2 p[obj]k〉
14
Since sase must be adjacent to a verb phrase, and mi by itself is a transitive verb, what is shown in (9) is a feature structure corresponding to [ Xk-wo mi ]-sase, where Xk is a noun
This gives the following TVP structure for X-wo mi-sase : TVP[ 〈4, 5〉]
(12) 2PPk
- 〈4p[sbj]i, 5p[obj]j , 2〉 DTV [- 〈p[sbj]j , 2〉]
X-wo
mi-sase
Note that two separate lists of - are represented in a single (complex) lexical entry mi-sase. Thus, the crucial difference is whether the hierarchical structure in terms of - is represented distributively in a constituency structure like (10) or centrally inside of a single (complex) lexical item like (11). In this sense, the version of “monoclausal” analysis proposed by Iida et al. (1994) and Manning et al. (this volume) could in fact be considered a “biclausal” model expressed in a single lexical item by means of (cf. section 3.3 on reflexives and section 5 on honorification).
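To make the contrast concrete, the two representations of mi-sase can be caricatured as follows (a rough sketch along the lines of (10) and (11); the attribute names, the flat encoding, and the use of plain dictionaries are mine and stand in for the actual typed feature structures):

```python
# "Biclausal" rendering: mi and sase remain separate lexical items, and the
# lower argument-structure list lives on the embedded verb, not on sase.
mi = {"phon": ["mi"], "arg_s": ["p[sbj]_j", "p[obj]_k"]}
sase_biclausal = {
    "phon": ["sase"],
    "arg_s": ["p[sbj]_i", "p[obj]_j", "VP_complement"],
    "adjacent": mi,   # the hierarchy is distributed over the constituent structure
}

# "Monoclausal" rendering: mi-sase is a single word with a ditransitive-style
# case frame; information about the verb stem is kept inside the complex word.
mi_sase_monoclausal = {
    "phon": ["mi-sase"],
    "arg_s": ["p[sbj]_i", "p[obj]_j", "p[obj]_k"],
    "stem": {"phon": ["mi"], "arg_s": ["p[sbj]_j", "p[obj]_k"]},
}
```

Either way there are two argument-structure lists; the analyses differ only in whether the second one is located on a separate lexical item or inside the complex word itself.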
3
Semantic/syntactic integrity
As argued by Kitagawa (1986), Iida et al. (1994), and Manning et al. (this volume), among others, in detail, there is no doubt that a sequence like mi-sase is a phonological unit. It behaves just like a single word as far as phonology is concerned, favoring the “monoclausal” approach. In short, the kind of constituency shown in (10) is incompatible with these facts. This is a major obstacle in the kind of “biclausal” lexicalist approach assumed here, since we cannot assume any kind of reorganization of the structure to happen later. However, there are other aspects in such a sequence that are inconsistent if it is indeed a single word. In these aspects, the verb stem preceding the causative and its complement, if any, behave as a verb phrase and hence as a semantic (and syntactic) unit, favoring the “biclausal” approach. In this section, I will review some of the representative arguments for the semantic (and syntactic) integrity of the preceding verb phrase.
3.1
Pseudo-cleft
Consider the following causative sentence: (13) Ken-ga Naomi-ni TV-wo mi-sase-ta. Ken- Naomi- TV- see-- “Ken caused Naomi to watch TV.” phrase whose value is k. The value of is a complex list complying with the morphonological principle introduced later.
We have the following two counterparts for the above sentence: (14) a. Ken-ga Naomi-ni sita koto-wa TV-wo mi-sase-ru Ken- Naomi- did thing- TV- see-- koto-dat-ta. thing-- “(lit.) What Ken did to Naomi was to cause (her) to watch TV.” b. Ken-ga Naomi-ni s-ase-ta koto-wa TV-wo mi-ru Ken- Naomi- do-- thing- TV- see- koto-dat-ta. thing-- “(lit.) What Ken caused Naomi to do was to watch TV.” In the above sentences, the first type shows that TV-wo mi-sase is a unit, while the second shows that only TV-wo mi is a unit. Since both (14a) and (14b) should be related to (13) in some way (perhaps semantically), an analysis that allows the constituency of both TV-wo mi and TV-wo mi-sase, namely, some version of “biclausal” analysis (cf. the constituency shown in (10)), will have less difficulty in explaining these data. Note that the behavior of the causative mi-sase is different from a simple lexical ditransitive verb like mise “show,” whose lexical structure is shown below, in that there is no counterpart of the second type, as is shown in (16): (15) Simple lexical ditransitive verb mise “show”: word 〈mise〉 verb show i j k - 〈p[sbj]i, p[obj]j , p[obj]k〉 (16) a. Ken-ga Naomi-ni TV-wo mise-ta. Ken- Naomi- TV- show- “Ken showed Naomi a TV (set).” b. Ken-ga Naomi-ni sita koto-wa TV-wo mise-ru Ken- Naomi- did thing- TV- show- koto-dat-ta. thing-- “(lit.) What Ken did to Naomi was to show (her) a TV.” c. (no second counterpart for a simple ditransitive verb) That is, there can be no way of “to watch TV” being a unit. This is predicted since there is no syntactic constituent (either an embedded clause or VP)
corresponding to “to watch TV.” Thus, the difference between mi-sase and mise in this respect needs to be explained if a “monoclausal” approach is assumed. One might argue that a purely semantics-based account of this type of construction could be developed. Note that even if such an account turns out to be possible, it has to take care of its interaction with other syntax-related phenomena such as binding. For example, if you have zibun in (14b), it can be bound by both Ken and Naomi, a sign of biclausal structure (cf. section 3.3): (17) Ken-ga Naomi-ni s-ase-ta koto-wa zibun-no TV-wo Ken- Naomi- do-- thing- self- TV- mi-ru koto-dat-ta. see- thing-- “(lit.) What Ken caused Naomi to do was to watch his/her TV.”
3.2
Semantic scope
The following sentences have different interpretations depending on the location of na “” in the split quantificational expression sika . . . na “only . . . .” (18) a. Ken-ga kinoo-sika Naomi-ga ki-ta-to Ken- yesterday-only Naomi- come-- iw-anakat-ta. say-- “It was only yesterday that Ken said Naomi came.” b. Ken-ga kinoo-sika Naomi-ga ko-nakat-ta-to Ken- yesterday-only Naomi- come--- it-ta. say- “Ken said that it was only yesterday that Naomi came.” The former says that it is only yesterday that Ken reported that Naomi came; even though she may have come on the days earlier than yesterday, he didn’t report her coming on those days. On the other hand, the latter sentence says that, according to Ken’s report, it was only yesterday that Naomi came, and his report could be made today, not necessarily yesterday. The split quantificational phrase sika . . . na must be, in general, in the same clause, as the following unacceptable sentences show, where [ ] indicates a clause boundary:15 15
This fact has been pointed out in the early generative literature by McGloin (1976) and Muraki (1978), among others, as a constraint on derivation. See Sells (1994) for more discussion and literature. Kitagawa (1986) also discusses this construction in his argument for reconstruction of biclausal structure at LF for causatives.
(19) a. *Ken-ga [Naomi-ga Osaka-ni-sika ki-ta]-to Ken- Naomi- Osaka-to-only come-- iw-anakat-ta. say-- b. Ken-ga [Naomi-ga Osaka-ni-sika ko-nakat-ta]-to Ken- Naomi- Osaka-to-only come--- it-ta. say- “Ken said that Naomi came only to Osaka.” c. *Ken-ga kogoede-sika [Naomi-ga ko-nakat-ta]-to Ken- softly-only Naomi- come--- it-ta. say- d. Ken-ga kogoede-sika [Naomi-ga ki-ta]-to Ken- softly-only Naomi- come-- iw-anakat-ta. say-- “Ken said only softly that Naomi came.” Hence, in (18a), kinoo-sika is in the main clause, while it is inside of the embedded clause in (18b), as indicated by the following schemata: (20) a. Ken-ga kinoo-sika [Naomi-ga ki-ta]-to iw-anakat-ta. b. Ken-ga [kinoo-sika Naomi-ga ko-nakat-ta]-to it-ta. The generalization we get from these facts is that the scope of sika . . . na cannot be outside of the clause in which sika appears. Now, consider the following sentence, which has at most three interpretations: (21) Ken-ga Naomi-ni TV-sika mi-sase-nakat-ta. Ken- Naomi- TV-only see--- “It was only TV that Ken caused Naomi to watch.” (22) a. (Wide scope) Ken didn’t cause Naomi to watch other things. b. (Narrow scope) Ken didn’t cause Naomi to do other things. The wide-scope reading (22a) is obtained if sika . . . na has mi-sase “cause to see” within its scope. On the other hand, the narrow-scope reading (22b) can only be obtained if sika . . . na is assumed to have only mi in its scope. These contrasts can be explained if TV-sika mi-sase is considered to have a “biclausal” structure, and TV-sika can be either inside or outside of the embedded “clause.” Informally, the contrast can be schematically shown by the following, where [] indicates a VP boundary.16 16
Here “clause” is used in an extended sense to include VP, in the sense of “subclause” by Sells (1994). Note that, in either structure, na appears after sase, unlike the sentences in (18),
(23) a. (Wide scope) Ken-ga Naomi-ni TV-sika [mi]-sase-nakat-ta. b. (Narrow scope) Ken-ga Naomi-ni [TV-sika mi]-sase-nakat-ta. In the construction shown in (23a), TV-sika is outside of the inner “clause” headed by mi. This construction is compatible with the “monoclausal” approach in that mi-sase is a single unit. On the other hand, (23b) assumes that TV-sika is inside of the inner “clause,” in which TV-sika and the head verb mi constitute an embedded VP as the complement of the causative. In this case, sase attaches to the VP constituent; hence the causative is out of the scope of sika. This structure is what the “biclausal” approach assumes for constituency. Note that this kind of ambiguity does not arise for the lexical ditransitive verb mise “show,” for which there can be no “biclausal” structure. (24) Ken-ga Naomi-ni TV-sika mise-nakat-ta. Ken- Naomi- TV-only show-- “It was only TV that Ken showed Naomi (i.e., Ken didn’t show Naomi other things).” In short, the narrow-scope reading indicated in (22b) is unexpected for the “monoclausal” analysis, since if the verb complex mi-sase is a word, it should behave just like mise as far as the scope of sika is concerned. It is only if a VP constituency is assumed that sika can modify part of the verb-complex mi-sase. There is, however, another aspect of (21) that may count as a counterexample to the “biclausal” approach. Note that, in the construction shown in (23a), TV-sika, apparently the complement of mi, is outside of the VP headed by mi. One of the possible approaches for the “biclausal” analysis would be to assume that this construction is obtained by “preposing” the in which na can appear between ki/ko “come” and to “.” This is because, morphophonologically, sase must always be adjacent to the preceding verb stem and no word can intervene between them. I will discuss the related phenomenon of the prefix “preposing” in honorification in section 5.3. Note, incidentally, that, if a VP, rather than a full clause, is embedded, the negative can be attached to either the embedded verb or the matrix verb, if both are possible (cf. Muraki 1978 and Sells 1994): (i) a. Ken-ga Naomi-ni TV-sika mi-temoraw-anakat-ta. Ken- Naomi- TV-only see-ask-- “Ken only asked Naomi to watch TV.” b. Ken-ga Naomi-ni TV-sika mi-nai-demorat-ta. Ken- Naomi- TV-only see--ask- “Ken only asked Naomi to watch TV.” Muraki (1978) treats this kind of construction by a restructuring transformation. See Sells (1994) for arguments against such a treatment. See also Kuno (1988) for his argument for raising the sika phrase to the main clause. I will assume that the negative na can be “postposed” by a morphophonological process similar to the one that in effect “preposes” the honorification prefix o- (cf. section 5.3).
relevant constituent (presumably by a kind of topicalization or “raising”).17 Note the following sentences where a quantificational phrase like TV-sika is “preposed”: (25) a. *Osaka-ni-sika Ken-ga [Naomi-ga ko-nakat-ta]-to Osaka-to-only Ken- Naomi- come--- it-ta. say- b. ?Osaka-ni-sika Ken-ga [Naomi-ga ki-ta]-to Osaka-to-only Ken- Naomi- come-- iw-anakat-ta. say-- “Ken said that Naomi came only to Osaka.” c. Osaka-sika Ken-ga [Naomi-ga ki-ta]-to Osaka-only Ken- Naomi- come-- iw-anakat-ta. say-- “Ken said that Naomi came only to Osaka.” d. Ken-ga [Osaka-ni-sika Naomi-ga ko-nakat-ta]-to Ken- Osaka-to-only Naomi- come--- it-ta. say- “Ken said that Naomi came only to Osaka.” e. Kogoede-sika Ken-ga [Naomi-ga ki-ta]-to softly-only Ken- Naomi- come-- iw-anakat-ta. say-- “Ken said only softly that Naomi came.” Note that (25a) is unacceptable as Osaka-ni-sika is made to be a constituent of the main clause, even though na is in the embedded clause. On the other hand, we have a slightly better acceptability for (25b), in which Osaka-ni-sika is preposed and na attaches to the main verb. The oddness of this sentence comes from the fact that Osaka-ni, with the locative marker, has to somehow modify the main verb if simply preposed. In (25c), Osaka-sika is topicalized, as the dropping of the locative indicates. Since it is simply a topic and can have any suitable relationship with the rest of the sentence, the acceptability of the sentence improves. (25d) is a case where Osaka-ni-sika is at the initial position of the embedded clause. Since it is still in the embedded clause, the sentence is acceptable. (25e) has the same interpretation as (19d) since in both 17
Of course, the terms “preposing” and “raising” are used only figuratively here. As for a counterpart of “raising,” a more concrete proposal is made later by the name of argument attraction (cf. (52)).
sentences kogoede-sika is in the main clause. Thus, we can say in general that if a sika phrase is “preposed,” the scope of sika becomes wide. Now observe the following sentences: (26) a. Ken-ga TV-sika Naomi-ni [mi]-sase-nakat-ta. Ken- TV-only Naomi- see--- “It was only TV that Ken caused Naomi to watch; he didn’t cause her to watch other things.” b. TV-sika Ken-ga Naomi-ni [mi]-sase-nakat-ta. TV-only Ken- Naomi- see--- “It was only TV that Ken caused Naomi to watch; he didn’t cause her to watch other things.” In (26a) and (26b), the sika phrase is “preposed.” Then, the scope of sika becomes the main clause and these sentences have no ambiguity; they only have the wide-scope reading. By the same token, the structure of (21) with the interpretation (22a) can be considered to involve “preposing” and taken out of the embedded “clause” (even though the resulting string is exactly the same as the one where no “preposing” occurs). In this sense, an analysis that assumes VP embedding is consistent with the ambiguity exhibited above.18 On the contrary, a construction like (23b) could never be obtained for the “monoclausal” approach since decomposition of a word at the level of syntax is not allowed. The “monoclausal” lexicalist approach proposed by Iida et al. (1994) and Manning et al. (this volume) is extended so that even in a construction like (23a), sika can have access to part of the verb complex mi-sase by way of a typeraising lexical rule. In this kind of analysis an adverbial is freely incorporated into the (and hence -) list, giving rise to a new lexical item that seeks an adverbial. Thus, from mi shown in (8), you get a variant of mi that seeks an adverbial: 18
One might be tempted to claim that the ambiguity shown in sentences like (21) is due to the lexical ambiguity of sase; between the meaning of “force” and that of “permit.” For example, one could argue that (22a) is associated with the “force” reading and (22b) with the “permit” reading, or the other way around. This argument fails because the ambiguity of (21) can still be obtained in the same interpretation of sase; the results of substituting let Naomi for cause Naomi to in both (22a) and (22b) are still possible readings of (21), and similarly for the substitution of made Naomi for cause Naomi to. Moreover, lexically nonambiguous passive rare shows exactly the same kind of ambiguity: (i) a. Naomi-ga Ken-ni te-sika sawar-are-nakat-ta. Naomi- Ken- hand-only touch--- “It was only her hand that Naomi had Ken touch.” b. She didn’t have him touch other things. c. She didn’t have him do other things. Thus, the ambiguity is more of a structural nature than a lexical nature.
(27)
〈mi〉
verb adv′
see j k - 〈p[sbj]j , p[obj]k, adv〉
where adv′ is a sort corresponding to the semantics of mi as modified by the sort adverbial. If this lexical item is fed as the stem for the causative, the value of for the causative form mi-sase looks like the following: (28)
verb cause i j 1 - 〈p[sbj]i, p[obj]j , 2, 3〉 〈mi 〉
verb adv′
see j k - 〈p[sbj]j , 2 p[obj]k , 3 adv〉 1
This mi-sase will take an adverbial as one of the complements. Note that the semantic scope of the adverbial is fixed by the type-raising lexical rule; it takes the narrow scope. If, on the other hand, the type-raising lexical rule applies to mi-sase in the form of (11), the following structure is obtained:
(29)
verb adv′
cause i j 1 - 〈p[sbj]i, p[obj]j , 2, adv〉 〈mi 〉
verb see 1 j k - 〈p[sbj]j , 2 p[obj]k 〉
These two lexical items are claimed to be responsible for the two readings of (21).
3.3
Reflexives
The antecedent of the reflexive zibun “self ” is typically the subject. Thus, for sentences with ordinary (di)transitive verbs, the object cannot be the antecedent: (30) Ken-ga Naomi-ni zibun-no TV-wo mise-ta. Ken- Naomi- self- TV- show- “Ken showed Naomi his/*her TV.” When a causative sase is used, zibun is known to be ambiguous in spite of the fact that there is only one (surface) subject:19 (31) Ken-ga Naomi-ni zibun-no TV-wo mi-sase-ta. Ken- Naomi- self- TV- see-- “Ken made Naomi see his/her TV.” As far as syntactic binding is concerned, the least oblique element in an - list (i.e., the subject) can be the antecedent of zibun (cf. Gunji 1987, Iida 1992, among others). Then, an account that provides two sources for the - lists will be able to account for the ambiguity. As is shown in (10), either the object coindexed with the least oblique element in the “upper” - (i, namely the matrix subject) or the one coindexed with the least oblique element in the “lower” - ( j, namely the causee) can be the antecedent of zibun. Thus, the “biclausal” approach can correctly predict the ambiguity of zibun in causatives. On the other hand, the “monoclausal” lexicalist approach as advocated by Iida et al. (1994) will have to allow Binding Principle A to have access to the - list inside of the as shown in (12) in order to be able to predict the ambiguity of the reflexive. That is, it has to stipulate that either the object coindexed with the least oblique element in the “upper” - (i, namely the matrix subject) or the one coindexed with the least oblique element in the “lower” (i.e., inside of ) - ( j, namely the causee) can be the antecedent of zibun.20 In this sense, even though the version of the “monoclausal” lexicalist approach as shown in (12) can make a distinction between 19
20
Recently, there are analyses of the Japanese reflexive that argue that the antecedent of zibun is not (restricted to) a grammatical subject, but is determined from factors that are of more semantic and pragmatic nature (cf. Iida 1992). Such an analysis allows an object to be an antecedent of the reflexive in some cases. However, this type of analysis will still have to explain why (30) is not ambiguous, on one hand, and exactly by what semantic and pragmatic factors the object in (31) (i.e., the causee) can be the antecedent of zibun. Note that in both (30) and (31), Naomi has similar semantic and pragmatic roles (the person who is shown and sees a TV). Manning et al. (this volume) propose to have nested argument structures for complex predicates (cf. note 12). This is another way to provide two sources for -.
a lexical ditransitive verb (semantically causative) and a verb complex affixed by sase in terms of -, it can do that at the expense of “reconstructing” a “biclausal” structure inside of a single lexical item.21
3.4
Pronouns
Even though the status of Japanese “pronouns,” particularly overt ones like kare “he” or kanozyo “she,” is not clear in terms of the disjoint reference condition (the Binding Principle B: a pronominal complement cannot be coindexed with a less oblique complement in the same - list (Pollard and Sag 1994)), in some cases, they obey Principle B and exhibit sharp contrast as compared with an anaphora (reflexive zibun). (32) a. Ken-ga kare-wo mi-ta. Ken- he- see- “Ken saw him (≠ Ken).” b. Ken-ga ø mi-ta. Ken- zero pro. see- “Ken saw him/her/it (≠ Ken).” c. Ken-ga zibun-wo mi-ta. Ken- self- see- “Ken saw himself (= Ken).” Now, observe the following sentence with a sase-attached complex verb and the one with a genuine ditransitive verb: (33) a. Ken-ga Naomi-ni kare-wo/ø mi-sase-ta. Ken- Naomi- he-/zero pro. see-- “Ken made Naomi see him.” b. Ken-ga Naomi-ni kare-wo/ø mise-ta. Ken- Naomi- he-/zero pro. show- “Ken showed him (≠ Ken) to Naomi.” 21
Some people argue that zibun is not an anaphora, since it shows long-distance dependency, unlike the reflexive in European languages, in apparent violation of Principle A of the Binding theory. Instead, they claim that zibun zisin is a true anaphora whose antecedent is clause-bound. Thus, the following is claimed to be unambiguous (Naomi being the only antecedent): (i) Ken-ga Naomi-ni zibun zisin-no TV-wo mi-sase-ta. Ken- Naomi- self- TV- see-- “Ken made Naomi see self ’s TV.” However, to many speakers of Japanese, including the author, zibun zisin is simply an emphatic form and shows no distinct property in terms of binding. Moreover, the emphatic connotation of zibun zisin sometimes makes the sentence stylistically less acceptable. Thus, you cannot simply substitute zibun zisin for zibun for every case. In fact, the above sentence is less natural as compared with (31). At any rate, what is at issue is not whether zibun is an anaphor obeying the usual interpretation of Principle A, but whether its antecedent can correctly be determined based on the concept of obliqueness.
Unlike the cases with sentences involving ditransitive verbs, a pronominal can be bound by the subject in (33a). If Principle B of binding theory is applicable to Japanese overt and covert pronominals, then, this suggests that the subject and the pronominal cannot be in the same - list, since, if they were, the subject would become less oblique than the pronominal and could not bind it. For the “biclausal” approach, there is no violation of Principle B, since the pronominal is subcategorized for only by mi “see” and the - list for sase contains only the subject (Ken in (33a)) and the object (Naomi ), besides the VP. Ken and kare are never in the same - list. Note, unlike the case for the reflexive, for a “monoclausal” approach, simply realizing dual - lists in a structure like (12) cannot solve this problem, as both the pronominal and the subject are in the “upper” - list. Thus, Iida et al. (1994) treat this apparent violation of Principle B by allowing existential interpretation of Principle B: a pronominal is only required to be a-free (not bound by a less oblique coargument) in some of the - lists, if there are more than one. According to them, since the verb complex mi-sase has two - lists, one for the inner verb mi and the other for the entire verb complex, and the accusative object is free in the - list of the inner verb in (33a), there is no Principle B violation.22
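Stated schematically, the "existential" reading of Principle B requires only that there be at least one argument-structure list containing the pronominal on which no less oblique coargument shares its index. A toy rendering of that check, under the assumptions just described (Python; the tuple encoding of indices and the function names are purely illustrative):

```python
# Sketch of the "existential" Principle B check discussed above.
# Each argument-structure list is a sequence of (index, type) pairs ordered
# from least to most oblique; the encoding is illustrative, not the formalism.

def a_free_on(arg_s, pronoun_index):
    """True if no coargument less oblique than the pronominal shares its index."""
    position = next(i for i, (idx, typ) in enumerate(arg_s)
                    if typ == "pron" and idx == pronoun_index)
    return all(idx != pronoun_index for idx, _ in arg_s[:position])

def principle_b_existential(arg_s_lists, pronoun_index):
    """The pronominal only needs to be a-free on SOME list that contains it."""
    relevant = [lst for lst in arg_s_lists
                if any(typ == "pron" and idx == pronoun_index for idx, typ in lst)]
    return any(a_free_on(lst, pronoun_index) for lst in relevant)

# (33a), with kare coindexed with the matrix subject ("i"):
outer = [("i", "np"), ("j", "np"), ("i", "pron")]   # list for the whole complex
inner = [("j", "np"), ("i", "pron")]                # list for the inner verb mi
print(principle_b_existential([outer, inner], "i"))  # True: a-free on the inner list
```

On the outer list the pronominal is bound by the coindexed subject, but on the inner list its only less oblique coargument is the causee, so the existential condition is satisfied, which is the pattern observed in (33a); with a genuine ditransitive like mise there is only one list, and the coindexing is correctly excluded.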
3.5
Coordination
Observe the following: (34) Ken-ga Naomi-ni TV-wo mi sosite e-wo kak-ase-ta. Ken- Naomi- TV- see and picture- draw-- “Ken made Naomi watch TV and draw a picture.” Note that, in the above sentence, the VP TV-wo mi and e-wo kak are presumably conjoined and sase is attached to this complex VP; it is not the case that only the second conjunct is to be understood as causativized. That is, at least semantically, it is a conjunction of two causatives: “cause to see” and “cause to write.” Arguably, this structure could be analyzed as a case of adjunction and not a case of coordination. If this line of analysis turns out to be correct, then such an analysis will have to explain why TV-wo mi, as a modifier meaning something like “while watching TV,” only modifies e-wo kak, not e-wo kak-ase, 22
A reviewer for this volume suggests another possible treatment for a “monoclausal” approach by allowing the inheritance of the object of the - list in the stem to the “upper” - list to be optional, though the content of the list is made to be obligatorily inherited. In fact, such decoupling of and - are suggested and studied in a variety of languages by Manning and Sag (1995). Based on this idea, Manning et al. (this volume) propose an alternative approach in terms of nested - lists. As is shown in the - list in note 13, if a-freeness is determined among the list members at the same level, the subject in the outer list cannot bind the object in the inner list. I will leave open the adequacy of such treatments.
since “while watching TV” modifies the way in which Naomi draws a picture, not the way in which Ken makes Naomi draw a picture. Note that the kind of type-raising lexical rule that gives you (27) will only solve half of the problem; even though it will give the interpretation in which TV-wo mi modifies e-wo kak, it will not necessarily exclude the interpretation in which the former modifies e-wo kak-ase. Remember that type raising necessarily predicts the ambiguity of (21); it doesn’t give only the narrow-scope reading.
3.6
Idiom chunk
Observe the following pair of sentences: (35) a. Naomi-ga Ken-ni te-wo yai-tei-ru. Naomi- Ken- hand- burn-- “Naomi is troubled with Ken.” b. Ken-ga Naomi-ni te-wo yak-ase-tei-ru. Ken- Naomi- hand- burn--- “Ken is causing trouble to Naomi.” The idiomatic expression te-wo yak “be troubled” can be made to be a causative form just like other intransitive verbs, and the idiomatic meaning is preserved. In the “biclausal” approach, the reason why the idiomatic meaning of te-wo yak is preserved in (35b) is straightforwardly explained in a compositional semantics, since it remains a constituent. In a “monoclausal” approach, however, reconstructing the idiomatic meaning from te and yak-ase compositionally would not be so straightforward. At least, such an analysis would have to duplicate the construction of the idiomatic meaning from te and yak for te and yak-ase. If the form like yak-ase is derived from yak by a lexical rule, or any other general mechanism, the idiomatic meaning of yak when combined with te will be able to be “encoded” in the yak-ase also. Thus, what is at issue here is not whether or not a compositional semantics is possible for causatives of idioms, but where to represent idiomatic meanings: whether they are associated with only one part of the lexicon and the causative meaning is computed compositionally or they are associated with several mutually related forms in the lexicon (even if the association could be made “on-line” by a lexical rule or a hierarchically organized lexicon). Note this kind of idiom where an accusative object is followed by a transitive verb is abundant: mimi-wo sumas “clear ear, i.e., listen carefully,” me-wo kake “send eye, i.e., look after,” hozo-wo kam “bite navel, i.e., regret,” etc.
3.7
Case marking
It has been pointed out that the so-called “double wo constraint” applies to causatives and the causee cannot be marked as accusative if there is another accusative object (Kuroda 1965, Harada 1973, Shibatani 1973). Thus, even
though (13), repeated below, is acceptable, the sentence with the accusativemarked Naomi is unacceptable: (13) Ken-ga Naomi-ni TV-wo mi-sase-ta. Ken- Naomi- TV- see-- “Ken caused Naomi to watch TV.” (36) *Ken-ga Naomi-wo TV-wo mi-sase-ta. Ken- Naomi- TV- see-- (36) is expected to be acceptable in a “biclausal” analysis since sase itself allows accusative marking for the causee if there is no other accusative object: (37) Ken-ga Naomi-wo aruk-ase-ta. Ken- Naomi- walk-- “Ken caused Naomi to walk.” Thus, it is only when the embedded VP in a “biclausal” analysis contains its own accusative object that the causee is obligatorily marked dative. Thus, it seems that there has to be an arbitrary stipulation. As argued by Iida et al. (1994) and Manning et al. (this volume), this might be considered to be merely a reflection of the canonical case marking pattern of ditransitive verbs, as Japanese ditransitive verbs all take one dative object and one accusative object. However, there are reasons to doubt the generality of such an argument, since, for one thing, there is no “double ni constraint” for causatives in general:23 (38) Ken-ga Naomi-ni situmon-ni kotae-sase-ta. Ken- Naomi- question- answer-- “Ken caused Naomi to answer questions.” Since the above construction is not observed in genuine ditransitive verbs, the validity of the explanation of the “double wo constraint” based on the nonexistence of ditransitive verbs with double accusatives is questionable. Moreover, the “monoclausal” approach creates an entirely new kind of nonexistent verb form in terms of case marking. For example, in addition to the double dative “ditransitive” above, it would also create a “tritransitive” verb with two datives and one accusative, if the following is analyzed in a “monoclausal” approach: 23
Some people find some cases of double ni causatives marginal. For example, the following sounds somewhat worse than (38): (i) ?Ken-ga Naomi-ni Marie-ni aw-ase-ta. Ken- Naomi- Marie- meet-- “Ken caused Naomi to meet Marie.” However, the above is still better than (36). Thus, I will assume that a putative “double ni constraint,” if any, is much weaker than the “double wo constraint” at best.
(39) Ken-ga Naomi-ni Marie-ni hon-wo okur-ase-ta. Ken- Naomi- Marie- book- send-- “Ken caused Naomi to send a book to Marie.” As far as the author knows, there is no tritransitive verb in Japanese. If the causative is allowed to create an entirely new case grid in a “monoclausal” analysis, it would have to explain why only the double accusative “ditransitive” is prohibited. Thus, even though the exact status of the “double wo constraint” is not clear, it is not a simple constraint that can be attributed to the canonical case grid of existing lexical verbs.
3.8
Passive
It has been known that both objects of a lexical ditransitive verb can become the subject in a passivized sentence, though the case in which the accusative object is made to be the subject may be acceptable only when the agent is marked by niyotte, not the usual ni. (40) a. Naomi-ga Ken-ni(yotte) TV-wo mise-rare-ta. Naomi- Ken-by TV- show-- “Naomi was shown a TV by Ken.” b. TV-ga Ken-niyotte/?ni Naomi-ni mise-rare-ta. TV- Ken-by Naomi- show-- “A TV was shown to Naomi by Ken.” However, only one of the objects can be made to be the subject for mi-sase (Inoue 1976). Even the use of niyotte doesn’t improve the acceptability. (41) a. Naomi-ga Ken-ni(yotte) TV-wo mi-sase-rare-ta. Naomi- Ken-by TV- see--- “Naomi was shown a TV by Ken.” b. *TV-ga Ken-niyotte/ni Naomi-ni mi-sase-rare-ta. TV- Ken-by Naomi- see---
4
Constraints on morphophonology
4.1
Morphological principles
So far, we have seen arguments both for and against the “biclausal” approach. One thing we should notice here is the following correlation: the “monoclausal” approach is consistent with morphological and phonological phenomena, while the “biclausal” approach is more appropriate for explaining syntactic and semantic phenomena. Now, if we assume that a morphophonological structure is not necessarily isomorphic to the constituent structure, we can have a counterpart of VP-embedding in constituent structures represented by such
syntactic and semantic features as - and (that is, inside of the feature) and at the same time we can have a consistent morphophonological representation as the value of . Specifically, let us assume that the constituent structure itself doesn’t determine the linear order of morphemes, unlike the common assumption in generative grammar, where morphophonological sequence is obtained by traversing the leaves of the tree representing the constituent structure.24 We see what consequences result from this assumption below. First, we posit the following principle concerning the feature.25 (42) Adjacent Feature Principle a. The feature of a phrase is empty. b. In a complement-head structure, the feature of the (lexical) head, if nonempty, is a singleton list consisting of a feature structure that is identical to the value of the complement. This is one way of saying that a lexical head with a nonempty value is a bound morpheme and cannot be free. With the Adjacent Feature Principle stated in this way, the morphophonological shape of a Japanese phrase is determined by the following principle. (43) Morphonological Principle (preliminary version) In a headed structure of the form: [ 1 % 〈5〉] [
3 % 〈7〉]
head
2 % 〈6〉 4
a. if 4 = 〈 〉, union(3 % 〈7〉,2,1), 5 = 6. (The last of the phrase is the last of the head. All the other s are obtained by union.) b. otherwise, union(3,2,1), 5 = 7·6. (The last of the phrase is the last of the adjacent dependent followed by the last of the head. All the other are obtained by union.) 24
25
Dowty (1996), Reape (1996), and Kathol (1995), among others, propose a similar approach for German and other languages. In such a treatment, the terminal yield of a phrase structure tree is no longer assumed to define the phonological shape of the sentence the tree is to represent. Except for the requirement of adjacency, this principle is essentially the same principle as the Subcategorization Principle in HPSG. The first requirement is a technical way of imposing adjacency; since a phrase must have an empty adjacent feature, if its head has a nonempty feature, it is required to have a complement matching the category in the value of .
The union relation is a sequence union in the sense of Reape (1996) and defined in the following way: (44) a. union(〈 〉,〈 〉,〈 〉). b. union(〈A|X〉,Y,〈A|Z〉) if union(X,Y,Z). c. union(X,〈A|Y〉,〈A|Z〉) if union(X,Y,Z). That is, Z is a list obtained by merging X and Y with the condition that the relative order of elements in X and Y is preserved in Z. In short, (43a) only requires that the last element of the list of the phrase (5) be identical to the last element in the list of the head (6). All the other elements can appear anywhere (modulo the restriction required by the union relation). That is, the complements (but perhaps not adjuncts) are scramblable. Since the union relation is not one-to-one, the value of the phrase is not necessarily uniquely determined.
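Since union is stated as a relation, a given pair of lists can stand in it to several output lists; it is exactly Reape's "shuffle," and this non-functional character is what makes more than one surface order possible. The three clauses of (44) can be transcribed directly as follows (a Python sketch rendered as a generator of all solutions; illustrative only):

```python
def sequence_union(x, y):
    """Yield every list z such that union(x, y, z) holds: all interleavings
    of x and y that preserve the internal order of x and of y."""
    if not x and not y:
        yield []                                  # clause (44a)
        return
    if x:                                         # clause (44b): take the head of x
        for rest in sequence_union(x[1:], y):
            yield [x[0]] + rest
    if y:                                         # clause (44c): take the head of y
        for rest in sequence_union(x, y[1:]):
            yield [y[0]] + rest

# Two complements scrambling around each other:
print(list(sequence_union(["TV-wo"], ["Naomi-ga"])))
# [['TV-wo', 'Naomi-ga'], ['Naomi-ga', 'TV-wo']]
```

Because the two one-element lists shuffle into two outputs, a head with two non-final elements already licenses both surface orders; under (43), this is all that "scrambling" of complements amounts to.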
4.2
The status of adjuncts
Whether adjuncts are allowed to scramble or not is disputable. Scrambling is usually taken to be a phenomenon that doesn’t affect semantics. So, if an adjunct is scramblable, it is not supposed to have a different scope. In this connection, note the following Japanese counterparts of (253) in Pollard and Sag (1987): (45) a. Ken-ga itiniti-ni ni-kai iyaiya zyogingu-wo suru. (koto) Ken- day- 2- reluctantly jogging- do the fact “(The fact that) Ken jogs reluctantly twice a day.” b. Ken-ga iyaiya itiniti-ni ni-kai zyogingu-wo suru. (koto) Ken- reluctantly day- 2- jogging- do the fact “(The fact that) Ken jogs twice a day reluctantly.” Here, just like the original English sentences, we can observe orderdependency in the scopes of adjuncts. Moreover, the scrambling of an adjunct inside of a complement with other complements, in contrast to the scrambling of a complement inside of another complement with other complements, seems to be impossible, or such scrambling results in a very awkward sentence. Consider the following: (46) a. Ken-ga HPSG-no hon-wo yon-da. Ken- HPSG- book- read- “Ken read a book on HPSG.” b. *HPSG-no Ken-ga hon-wo yon-da. HPSG- Ken- book- read-
Thus, adjuncts seem to be more resistant to scrambling. One way to constrain the scrambling of adjuncts would be to make a separate statement of the morphonological principle for the adjunct-head structure. The list of the phrase in an adjunct-head structure is simply made to be the appended list of those of the adjunct and the head. Thus, the union(3, 2, 1) in (43) would apply only to complement-head structures and we would instead have append(3, 2, 1) for adjunct-head structures. Another possibility is to prohibit scrambling of anything from a saturated category, by making the constraints on morphophonology sensitive to the value of of the dependent. This approach will have to be adopted anyway if a provision is needed for restricting scrambling to a clause-bound phenomenon. (43b) says that the last element of the list of the phrase is obtained by “attaching” the last element of the list of the adjacent complement to the last element of the list of the head.26 In other words, the head is morphophonologically adjacent only to the last element of the list of the adjacent complement. Note that in (43b), the head is always a lexical head, since the value of a phrasal head must be empty by virtue of the Adjacent Feature Principle.
4.3
Scrambling
Now, let us see how scrambling is realized in this approach. For example, a sentence like Naomi-ga TV-wo mi “Naomi watches TV” will have the following constituent structure and value for each of its constituents: (47) a.
S PP
26
VP
NP
P
Naomi
ga
PP
V
NP
P
TV
wo
mi
Here “attaching” is used in the sense of Dowty (1996). In terms of the data type, attachment can simply be realized by enclosing the relevant elements in an additional pair of brackets. Thus, a list like 〈 . . . ,c·d, . . . 〉 is actually a shorthand notation for a nested list: 〈 . . . ,〈c,d〉, . . . 〉. I will use the former notation for the sake of readability. In general, putting an additional pair of brackets around a sequence of elements in the list in effect “freezes” the sequence, in the sense that no element inside the sequence is subject to the scrambling effect due to the union relation.
b.
constituent
possible values
[NP TV]
〈TV 〉
[P wo]
〈wo〉
[PP TV wo]
〈TV·wo〉
[V mi]
〈mi〉
[VP[PP TV wo] mi]
〈TV·wo, mi 〉
[PP Naomi ga]
〈Naomi·ga〉
[S[PP Naomi ga] [VP [PPTV wo] mi]]
〈Naomi·ga, TV·wo, mi 〉 〈TV·wo, Naomi·ga, mi 〉
As shown before, the postpositions are required to be to the preceding noun phrase (cf. (4)). Thus, TV “attaches” to wo and the sequence TV wo is “frozen” by virtue of (43b). On the other hand, the verb is not required to be adjacent to its complements (postpositional phrases), and hence the sequence TV wo mi corresponds to a list of two elements subject to scrambling, namely, TV·wo and mi. Due to the provision achieved by the union relation in (43), the nonfinal members of the lists in the dependent and the head are merged and scrambled, which results in the free order among complements of a head within a sentence. In short, scrambling is taken to be a phenomenon that only affects the morphological (and hence phonological) shape, and the structure represented inside of the feature is not affected. The formalization in (43) actually has no restriction on the scope of scrambling. Any member in the list on the head can scramble, allowing nonclause-mates to scramble with respect to one another. Such a case of “scrambling” seems indeed (marginally) possible: (48) ?Kono hon-wo Naomi-ga Ken-ga hometeita-to itta. this book- Naomi- Ken- praised- said “Naomi said that Ken praised this book.” There are, however, examples and arguments against scrambling across the sentence boundary. Note the following topicalized sentence, in which the topic marker wa replaces the accusative marker wo, sounds somewhat better: (49) Kono hon-wa Naomi-ga Ken-ga hometeita-to itta. this book- Naomi- Ken- praised- said “This book, Naomi said that Ken praised.” If the difference in acceptability between the above two sentences is significant and scrambling is indeed clause-bound, (43) has to be complicated so that the value of the mother will be “frozen” when it is a clause, or more generally, when it is a saturated phrase, as mentioned before. That is, when
the phrase is saturated (i.e., having an empty value), the value of the phrase is 〈1 % 〈5〉〉, rather than 1 % 〈5〉.
4.4
Causativization and scrambling
For a causative sentence, since sase is adjacent to the preceding verb phrase, the sequence consisting of the stem of the head verb of the VP and sase will be “frozen.” All the other complements (the subject and the objects, including the one in the embedded verb phrase) are free to scramble. For example, sentence (13): Ken-ga Naomi-ni TV-wo mi-sase will have the following values for its constituents: (50) a.
S PP
b.
VP
NP
P
Ken
ga
PP
TVP
NP
P
Naomi
ni
constituent
VP PP
V V
NP
P
TV
wo
sase
mi
possible values
[PP TV wo]
〈TV·wo〉
[VP[PP TV wo] mi]
〈TV·wo, mi 〉
[TVP[VP[PP TV wo] mi] sase]
〈TV·wo, mi·sase〉
[VP[PP Naomi ni] [TVP[VP[PP TV wo] mi] sase]]
〈Naomi·ni, TV·wo, mi·sase〉 〈TV·wo, Naomi·ni, mi·sase〉
〈Ken·ga, Naomi·ni, TV·wo, mi·sase〉 [S[PP Ken ga] 〈Ken·ga, TV·wo, Naomi·ni, mi·sase〉 [VP[PP Naomi ni] [TVP[VP[PP TV wo] mi] sase]]] 〈Naomi·ni, Ken·ga, TV·wo, mi·sase〉 〈Naomi·ni, TV·wo, Ken·ga, mi·sase〉 〈TV·wo, Ken·ga, Naomi·ni, mi·sase〉 〈TV·wo, Naomi·ni, Ken·ga, mi·sase〉
Note that 〈TV·wo·mi, sase〉 is not a possible value for the feature for TV wo mi sase, by virtue of (43b). That is, even though TV wo mi is a constituent (inside of the value of ), TV wo and mi belong to separate morphophonological units and the latter is actually a member of a morphophonological unit containing sase. In other words, TV wo mi is a constituent at the tectogrammatical level, where syntactic and semantic properties are defined, while mi sase is a morphophonological unit at the phenogrammatical level, where linear order and other phonological and morphological properties are defined. Many of the arguments that seem to favor the “monoclausal” approach are based on phonological phenomena and can be recast in terms of the list here. On the other hand, arguments favoring the “biclausal” analysis are mostly of syntactic and semantic nature and are represented in our approach in the feature structure inside of the feature, where the VP embedding structure is essentially retained.
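The possibilities listed in (50b) can be recomputed mechanically: the three postpositional phrases and the verb complex mi·sase are each frozen units, and the only further requirement imposed at each step is that the head's final element stay final. A toy check (Python; enumerating permutations directly gives the same six sequences that repeated applications of the union relation license in this particular case):

```python
from itertools import permutations

# Frozen morphophonological units (each written as one string).
complements = ["Ken·ga", "Naomi·ni", "TV·wo"]
verb_complex = "mi·sase"   # frozen because sase must be adjacent to the verb stem

# Non-final elements interleave freely; the head's last element stays last.
orders = [list(p) + [verb_complex] for p in permutations(complements)]
for order in orders:
    print(", ".join(order))
# Six sequences, matching the values listed for the S node in (50b).
```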
4.5
Argument attraction
We have seen that scrambling only affects the morphophonological shape, and the structure represented inside of the feature is not affected. There is, however, some evidence for the possibility that the value of - is affected in some cases of scrambling. Let us reconsider the binding phenomenon mentioned in connection with (33).27 Observe that the contrast shown in (33) disappears when the object is “preposed”: (51) a. Kare-wo Ken-ga Naomi-ni mi-sase-ta. he- Ken- Naomi- see-- “Ken made Naomi see him (≠ Ken).” b. Kare-wo Ken-ga Naomi-ni mise-ta. he- Ken- Naomi- show- “Ken showed him (≠ Ken) to Naomi.” The change in interpretation of kare in (33a) and (51a) indicates that kare-wo and Ken-ga in (51a) belong to the same - list. This is unexpected for the current approach, since as sketched in (47) the scrambled sentences have the same constituent structure and hence the head verb has the same - value. We need some provision to allow (51a) to have a different - list from (33a). Let us assume here the following kind of “argument attraction” lexical rule, which has often been proposed for various languages (e.g. Hinrichs and Nakazawa 1994 for German): 27
I owe Kaz Fukushima (p.c., 1992) for bringing this fact to my attention.
(52) Argument Attraction Lexical Rule if | verb is a lexical entry, then so is | verb
α 〈v[ β]〉 α %2 〈v[ β % 2]〉
Thus, for example, each verb that is adjacent to a VP is assumed to have its counterpart that is adjacent to a TVP and subcategorizes for an additional object, which is structure-shared with the object of the adjacent TVP. With this lexical rule, we are entitled to have a variant of sase that is adjacent to a TVP: (53)
〈sase〉
verb cause i j l 〈1p[sbj]i, 2 p[obj]j , 3〉 verb l 4 〈p[sbj]j , 3 p[obj]k 〉 - 〈1, 2, 3, 4〉 where j and k have some role in l Note that this allows scrambling of the second object with other complements, since this sase subcategorizes for three complements and acts just like a ditransitive verb. In this approach, mi-sase will become a constituent and have the following feature structure. This argument attraction lexical rule, in fact, gives essentially the same feature structure for mi-sase as the one assumed by the “monoclausal” lexicalist approach, namely, (11). (54)
phrase 〈mi·sase〉
verb cause i j
see j k 〈p[sbj]i, p[obj]j , p[obj]k〉
However, there are some differences in the “monoclausal” lexicalist approach and the “biclausal” lexicalist approach with the argument attraction lexical rule. First, while mi-sase in the “monoclausal” lexicalist approach (11) is a word, mi-sase in the “biclausal” lexicalist approach (54) is a phrase consisting of two words, mi and sase. Hence, there is no - list in (54) and only the - list of sase as in (53) is the locus of the Binding Principle B. Furthermore, in the “biclausal” lexicalist approach, the application of the lexical rule is not obligatory to get causative sentences, while the complex lexical specification for causatives in the “monoclausal” lexicalist approach, as in (7), must be obligatory in order to get causative sentences. Thus, in the “biclausal” lexicalist approach, unless alternative interpretation for the binding is concerned, there is no need to invoke the argument attraction lexical rule. Thus, even though there can be two different analyses for a sentence like (13) or (33a), one with (6) and the other with (53), such overgeneration is not inevitable. We could assume a general restriction that imposes some kind of “cost” (penalty) for such lexical rules as argument attraction; they are called for as a last resort, e.g., when alternative interpretation is required.
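Construed procedurally, the Argument Attraction Lexical Rule in (52) takes a verb that selects an adjacent verbal complement and returns a variant that additionally subcategorizes for an object which that complement is still missing. A schematic sketch (Python; the attribute names are mine, and genuine structure sharing is rendered here as mere copying):

```python
import copy

def argument_attraction(entry, attracted=("p[obj]_k",)):
    """Toy version of (52): the output selects a less saturated verbal
    complement and takes the attracted object(s) as its own complements."""
    new = copy.deepcopy(entry)
    new["comps"] = new["comps"] + list(attracted)                          # alpha % 2
    new["adjacent"]["comps"] = new["adjacent"]["comps"] + list(attracted)  # beta % 2
    return new

# A toy sase as in (6): a subject and an object, plus an adjacent,
# fully saturated VP complement.
sase_vp = {"comps": ["p[sbj]_i", "p[obj]_j"],
           "adjacent": {"cat": "verb", "comps": []}}

sase_tvp = argument_attraction(sase_vp)
print(sase_tvp["comps"])              # ['p[sbj]_i', 'p[obj]_j', 'p[obj]_k']
print(sase_tvp["adjacent"]["comps"])  # ['p[obj]_k'], i.e. a TVP complement
```

The optionality of the rule, and the idea of treating its application as a costly last resort, are then simply matters of when this variant entry is made available, not of any change to the basic causative entry itself.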
5
Honorification
If causativization cooccurs with honorification, several interesting interactions can be observed. In this section we will see additional arguments for a “biclausal” analysis based on the data involving honorification. In the latter half of the section, the morphophonological principle shown in (43) will be modified to eliminate an apparent drawback in the “biclausal” approach.
5.1
Subject honorification
When the subject is to be honored, the verb is followed by the morpheme ninar.28 (55) Suzuki-sensei-ga o-aruki-ninat-ta. Suzuki-teacher- walk-- “Prof. Suzuki (honored) walked.” (56) Suzuki-sensei-ga Naomi-no sakuhin-wo goran-ninat-ta. Suzuki-teacher- Naomi- work- see-- “Prof. Suzuki (honored) saw Naomi’s work.” The subject-honorified form can be followed by the causative, in which case it is the object, not the subject, that is honored. 28
In addition to the following ninar, the verb stem is preceded by o- or go-. As argued by Iida et al. (1994) and Manning et al. (this volume), the fact that the prefix o- or go- intervenes inside the embedded VP (cf. (56) and (58) below) may count as a potential problem for the “biclausal” approach assumed here. However, proper placement of the prefix o- inside the VP is achieved by slightly complicating the Morphonological Principle. We will come back to this problem later in section 5.3.
(57) a. Ken-ga Suzuki-sensei-ni Naomi-no sakuhin-wo Ken- Suzuki-teacher- Naomi- work- goran-ninar-ase-ta. see--- “Ken made Prof. Suzuki (honored) see Naomi’s work.” b. *Suzuki-sensei-ga Ken-ni Naomi-no sakuhin-wo Suzuki-teacher- Ken- Naomi- work- goran-ninar-ase-ta. see--- Let us assume, for the sake of concreteness, that a “monoclausal” lexicalist approach derives honorific forms by a complex lexical specification like (7). That is, just like mi-sase is a word in this approach, so are goran-ninar and goran-ninar-ase, and other combinations of verb stems and (s)ase and/or (g )o. . . ninar. On the other hand, the “biclausal” approach assumes that o- . . . ninar is a kind of raising verb that is required to be adjacent to a VP complement (cf. (63a) below). (57a) shows that goran-ninar-ase is apparently turned into a kind of object-honorification ditransitive verb, since Suzuki-sensei-ni is an object. This situation is reminiscent of the situation for reflexives (cf. (31)), where zibun can be bound by the apparent object of the complex causative verb. In the “biclausal” lexicalist approach, the subjecthood of the apparent object can be obtained based on the list of goran-ninar (or goran since ninar is a raising verb and the latter shares the same list as the former, cf. (63a) below) and the object Suzuki-sensei is semantically identical to the subject in the list for goran-ninar.29 When the causative form of a verb is followed by the honorific suffix, we get the following kind of sentence: (58) Suzuki-sensei-ga Ken-ni hon-wo o-kak-ase-ninat-ta. Suzuki-teacher- Ken- book- write--- “Prof. Suzuki (honored) made Ken write a book.” This sentence is unproblematic for both analyses.
5.2
Object honorification
When the object is to be honored, the verb is followed by su. Thus, we have the following sentences involving causatives and object honorification.30 29
30
As with reflexives, a “monoclausal” lexicalist could handle this by making an honorification mechanism have access to the - list of the inner verb of the verb complex by means of the feature or a nested - list. I owe Kaz Fukushima (p.c., 1992) for (59b). Some people find these marginal, perhaps for semantic and pragmatic reasons (because an honored person is caused to do something here). A somewhat more natural sentence than (59b) would be a sentence like the following:
(59) a. Ken-ga Suzuki-sensei-ni kooen-wo o-aruk-ase-si-ta. Ken- teacher- park- walk--- “Ken made Prof. Suzuki (honored) to walk through the park.” b. Hanako-ga obottyama-ni gohan-wo o-tabe-sase-si-ta. Hanako- boy()- meal- eat--- “Hanako made the boy (honored) eat the meal.” As first noted by Harada (1976), and later elaborated by Fukushima (1992, 1993), object honorification selects the most oblique object as the person (or thing related to the person) to be honored. Thus, in a double-object construction, it is the indirect (dative-marked) object and not the direct (accusative-marked) object that is honored.31 Thus, even though (60a) is acceptable, (60b) is bad under the intended interpretation. (60) a. Ken-ga Suzuki-sensei-ni Naomi-no sakuhin-wo Ken- Suzuki-teacher- Naomi- work- o-mise-si-ta. show-- “Ken showed Naomi’s work to Prof. Suzuki (honored).” b. *Ken-ga Naomi-ni Suzuki-sensei-no sakuhin-wo Ken- Naomi- Suzuki-teacher- work- o-mise-si-ta. show-- In a double-object construction, the - list of the ditransitive verb mise “show” consists of the subject, the direct object, and the indirect object, in the order of obliqueness (from least oblique to most oblique) and the most oblique object, namely, the indirect object is to be honored. Thus, an apparently similar sentence involving causativization is rejected in both analyses: (61) *Ken-ga Naomi-ni Suzuki-sensei-no sakuhin-wo Ken- Naomi- Suzuki-teacher- work- o-mi-sase-si-ta. see---
(i) Hanako-ga obottyama-ni gohan-wo o-mesiagar-ase-si-ta. Hanako- boy()- meal- eat---- “Hanako made the boy (honored) eat the meal.”
31
where mesiagar is an honorific form of tabe “eat.” Thus, the above is actually a double honorific form; both subject honorification and object honorification are expressed simultaneously. The “biclausal” approach of (59) would choose the VP kooen-wo aruk as the most oblique complement of sase. In order to prevent this VP from being the target of honorification, we assume that what is to be honored is the most oblique nominal complement (i.e., projections of nouns and postpositions); thus Suzuki-sensei-ni is chosen for (59a).
In the “biclausal” approach, the most oblique nominal complement for sase is Naomi-ni, resulting in an inappropriate reading. On the other hand, in the “monoclausal” approach, mi-sase is a ditransitive verb just like mise and thus (61) is bad for the same reason as (60): Suzuki-sensei-no sakuhin-wo is not the most oblique object of mi-sase. Consider now the alternative ordering of the two kinds of morphemes: (62) a. *Ken-ga Suzuki-sensei-ni Naomi-wo o-tasuke-s-ase-ta. Ken- Suzuki-teacher- Naomi- help--- b. Ken-ga Naomi-ni Suzuki-sensei-wo o-tasuke-s-ase-ta. Ken- Naomi- teacher- help--- “Ken made Naomi help Prof. Suzuki (honored).” The acceptability of (62b) may entail different consequences for the two analyses. (62b) is straightforward for the “biclausal” approach, as the honorific morpheme s(u) attaches to tasuke, not tasuke-sase, and Suzuki-sensei-wo is the most oblique object in the - list for tasuke “help.” On the other hand, this is not straightforward for the “monoclausal” lexicalist approach of causatives, since Suzuki-sensei-wo is not the most oblique complement for the complex “verb” o-tasuke-s-ase.32
5.3
Honorification morphophonology
So far, we have seen that there are some facts concerning honorification that seem to favor the “biclausal” approach. However, as mentioned in note 28, a major obstacle for the “biclausal” analysis of causatives is the morphophonological shape such as o-kak-ase-ninar “cause to write (subject-honored)” and o-aruk-ase-su “cause to walk (object-honored),” where honorification is involved. In these sequences, the prefix (g )o-, which doesn’t belong to the embedded VP, appears inside the VP. This apparent problem is in fact not a problem if we take into account the potential discrepancy of the phenogrammatical and tectogrammatical levels. Let us assume the existence of (g )o-ninar and (g )o-su as single lexical items. They are adjacent to a VP and a TVP, respectively, at the tectogrammatical 32
32 One alternative is to make the honorification mechanism have access to the subcategorization list of the inner verb tasuke by means of a dedicated feature or a nested subcategorization list, as mentioned with regard to reflexivization and subject honorification. Another alternative would be to resort to some kind of “derivational” concept. For example, it takes the “output” (so to speak) of the honorification lexical rule: o-tasuke-su, in which the most oblique object (i.e., the object of tasuke “help”) is marked as honored, and then adds a more oblique object (the causee). That is, if obliqueness is determined relative to the stages of application of lexical rules, the person to be honored can be determined appropriately. This is obviously an unwelcome move for a lexicalist and relies on essentially the same concepts as those assumed in such derivational analyses as Kitagawa (1986).
level.33 A detailed lexical specification for o-ninar and o-su would look like the following:34
(63) a. Subject Honorification Marker: prefix o, stem ninar; adjacent to a verbal sign subcategorizing for 〈p[sbj]〉 (i.e., a VP).
b. Object Honorification Marker: prefix o, stem su; adjacent to a verbal sign subcategorizing for 〈p[sbj], p[obj]〉 (i.e., a TVP).
That is, o-ninar is adjacent to a VP (a verbal subcategorizing for a subject), and o-su is adjacent to a TVP (a verbal subcategorizing for a subject and an object). Both of these forms are raising verbs in the sense that they inherit the subcategorization list of the adjacent complement. A pragmatic constraint associated with ninar will require the subject to be honored and one with su will require the most oblique object to be honored. The Morphonological Principle presented in (43) is slightly elaborated to take the prefix into account:
(64) Morphonological Principle (revised)
In a headed structure in which the morphophonological value of the head consists of a prefix part and a stem part:
a. if the head requires no adjacent dependent, the last element of the phrase’s morphophonological list is the last element of the head’s, and all the other elements are obtained by union;
b. otherwise, the last element of the phrase’s morphophonological list is the last element of the adjacent dependent followed by the last element of the head, preceded by the prefix of the head, and all the other elements are obtained by union.
That is, in general, the morphophonological value of the lexical head consists of a prefix part and a stem part,35 and the prefix is appended before the last member of the list contributed by the adjacent complement, if any. In this way, the prefix remains the prefix of a morphologically complex verb such as aruk-ase.36 Thus, the o- prefix, the morphophonological representation of the head verb, and the morpheme ninar will form a morphological and phonological unit, in this order, by the revised Morphonological Principle. Let us see some examples. The morphophonological representation of each constituent for (55) will look like the following. Note that o-ninar is taken to be a constituent in the constituent (i.e., tectogrammatical) structure.37
33 A nominal that behaves as a verb when followed by su can also precede (g)o-ninar and (g)o-su. In the following, we omit the case of go-ninar and go-su, the forms mostly used when they are adjacent to nominals, for the sake of expository simplicity.
34 New features for the prefix and stem parts of an element of the morphophonological list are introduced, adopting the proposal in Bird and Klein (1993) and Riehemann (1993). Note this is different from the one used by the “monoclausal” lexicalist.
(65) a. [S [PP [NP Suzuki-sensei] [P ga]] [VP [VP [V aruk]] [V o-ninar]]]
b.
constituent                                              possible values
[PP Suzuki-sensei ga]                                    〈Suzuki-sensei·ga〉
[VP aruki]                                               〈aruki〉
[V o-ninar]                                              〈o;ninar〉
[VP [VP aruki] o-ninar]                                  〈o;aruki·ninar〉
[S [PP Suzuki-sensei ga] [VP [VP aruki] o-ninar]]        〈Suzuki-sensei·ga, o;aruki·ninar〉
35 If the morphophonological value contains no prefix part, the prefix is simply empty and all the morphophonology is represented by the stem part. In this case, (64) simply reduces to (43).
36 This provision is essentially a revival of head wrapping in Head Grammar (Pollard 1984).
37 In the following, an element consisting of a prefix α and a stem β will be expressed as [α; β].
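The effect of (64) can be made concrete with a small illustrative sketch. The following Python fragment is not part of the chapter’s formalism; the function names and the representation of morphophonological lists as plain strings are assumptions made purely for exposition. It shows how the prefix of o-ninar is “wrapped around” the last element of the adjacent complement’s list, which is how 〈aruki〉 combines with o-ninar to give 〈o;aruki·ninar〉 in (65b).

```python
# Expository sketch only; names and representations are assumptions.

def combine_adjacent(dependent_phon, head_stem, head_prefix=""):
    """Case (b) of (64): join the dependent's last element with the head's
    stem, prepose the head's prefix (if any) to that unit, and keep the
    earlier elements unchanged (the 'union' part of the principle)."""
    *rest, last = dependent_phon
    unit = f"{last}·{head_stem}"
    if head_prefix:
        unit = f"{head_prefix};{unit}"
    return rest + [unit]

def combine_nonadjacent(dependent_phon, head_phon):
    """Case (a) of (64): with no adjacency requirement, the head's last
    element stays last; the other elements are simply collected."""
    return dependent_phon + head_phon

# [VP [VP aruki] o-ninar] in (65b):
vp = combine_adjacent(["aruki"], "ninar", head_prefix="o")
print(vp)        # ['o;aruki·ninar']

# [S [PP Suzuki-sensei ga] [VP ... o-ninar]]: the subject PP is not an
# adjacent complement, so the honorific complex stays one unit.
print(combine_nonadjacent(["Suzuki-sensei·ga"], vp))
# ['Suzuki-sensei·ga', 'o;aruki·ninar']
```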
The current approach can explain the following contrast pointed out by Fukushima (1992). (66) a. *Taroo-ga [[Suzuki-sensei-wo o-hagemasi] sosite Taroo- Suzuki-teacher- encourage and [Honda-sensei-wo o-tasuke]] si-ta. Honda-teacher- help hon.- b. Taroo-ga [[Suzuki-sensei-wo o-hagemasi-si] sosite Taroo- Suzuki-teacher- encourage- and [Honda-sensei-wo o-tasuke-si]] ta. Honda-teacher- help- “Taroo encouraged Prof. Suzuki (honored) and helped Prof. Honda (honored).” In (66a), since there is only one su, there could be only one prefix o- present, because our analysis always gives o and su as a pair. Thus, one of the prefixes (actually, the first one) is not accounted for. On the other hand, (66b) is acceptable as there is a one-to-one correspondence between the prefix and the stem. The current approach predicts the following to be also acceptable. (67) Taroo-ga [[Hanako-wo hagemasi] sosite [Honda-sensei-wo Taroo- Hanako- encourage and Honda-teacher- o-tasuke-si]] ta. help- “Taroo encouraged Prof. Suzuki and helped Prof. Honda (honored).” Note that there is only one prefix o- and only one stem su, which are both within the second conjunct VP. This sentence, incidentally, precludes the following alternative approach to honorification. Suppose we posit a special value, say hono (honorific form), and assume that an honorific form of a verb is defined by the following lexical rule: (68) if
a verb has a lexical entry whose morphophonological value is 〈γ〉 and whose form value is conj (the conjunctive form), then it also has a lexical entry whose morphophonological value is 〈o;γ〉 and whose form value is hono.
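A minimal sketch of what this alternative rule amounts to, assuming a simple dictionary representation of lexical entries (the keys below are expository labels, not the chapter’s actual feature names):

```python
# Expository sketch of the alternative honorific-form lexical rule (68).

def honorific_form(entry):
    """Map a conjunctive-form verb entry to its o-prefixed honorific
    counterpart; entries of any other shape are left alone."""
    if entry.get("cat") == "verb" and entry.get("form") == "conj":
        return {**entry, "phon": "o;" + entry["phon"], "form": "hono"}
    return None

print(honorific_form({"cat": "verb", "phon": "tasuke", "form": "conj"}))
# {'cat': 'verb', 'phon': 'o;tasuke', 'form': 'hono'}
```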
That is, the honorific form is identical to the conjunctive form of a verb, which is the form to precede the suffix ninar, except for the fact that the
value is prefixed by o- and the value is hono. In this approach, the morphophonological shape of the verbal for subject honorification is simply ninar (without the prefix) and that for object honorification is su. This has the advantage that it is consistent with the fact that some lexical items are inherently honorific. For example, goran “see()” can be made to be a lexical item that has inherently the value hono and does not have to be related to another lexical item like ran by the lexical rule in (68). In fact, there is no such verbal (or nominal) in the present Japanese (i.e. no ran as a noun or ransu(ru) as a verb). Another advantage comes from the fact that even though most of the (g )oprefixed forms are used for both subject and object honorification, appearing in positions adjacent to ninar as well as to su, goran can appear only before ninar. That is, goran is used only for subject honorification. For object honorification, haiken “see(),” another inherently honorific lexical item used only for object honorification, appears before su. This distinction could also be expressed in this alternative approach by providing two distinct values, say honos and honoo . On the other hand, as for (66a), this alternative approach will give a wrong prediction. Note that [[Suzuki-sensei-wo o-hagemasi ] sosite [Honda-sensei-wo o-tasuke]] is a coordinated VP, whose conjuncts are in the honorific form, according to this alternative approach. Thus, the coordinated VP itself should be in the honorific form and be allowed to be adjacent to the honorific stem su, making (66a) acceptable. Moreover, the example in (67) causes another problem, since how the value of for the conjoined VP is made to be hono is not clear.
5.4 Honorification and causativization
If a VP is followed by the causative, the prefix o- appears before the head verb of the embedded VP. This is predicted in the current approach and is consistent with the following fact: the position of the honorific prefix is not correlated with the target of honorification; it is solely determined by the position of the honorific stem ninar. Observe the following sentences: (69) a. *Ken-ga Suzuki-sensei-wo o-aruk-ase-ninat-ta. Ken- Suzuki-teacher- walk--- b. Suzuki-sensei-ga Ken-wo o-aruk-ase-ninat-ta. Suzuki-teacher- Ken- walk--- “Prof. Suzuki (honored) made Ken walk.” c. *Suzuki-sensei-ga Ken-wo o-aruki-ninar-ase-ta. Suzuki-teacher- Ken- walk--- d. Ken-ga Suzuki-sensei-wo o-aruki-ninar-ase-ta. Ken- Suzuki-teacher- walk--- “Ken made Prof. Suzuki (honored) walk.”
Note that (69a) is bad because the honorific stem ninar, by attaching to the causative sase, shows that the causer, i.e., Ken, is to be honored. Simply attaching the prefix o- to the verb aruk does not cause the walker to be honored. (69b) is good because Prof. Suzuki is the causer, as the honorific stem shows. If, on the other hand, the honorific stem is attached directly to the verb aruk, as in the case for (69c,d), the sentence becomes good when the walker is honored, and bad when the causer is honored. Exactly the parallel situation can be observed for object honorification: (70) a. Ken-ga Suzuki-sensei-ni kooen-wo o-aruk-ase-si-ta. Ken- Suzuki teacher- park- walk--- “Ken made Prof. Suzuki (honored) to walk through the park.” b. *Suzuki-sensei-ga Ken-ni kooen-wo o-aruk-ase-si-ta. Suzuki-teacher- Ken- park- walk--- (70a) (=(59a)) is good because the causee is honored and the honorific stem su shows that the object of causation is to be honored. On the other hand, (70b) is bad since the causee is not honored. Note also the following contrast, already mentioned. (62) a. *Ken-ga Suzuki-sensei-ni Naomi-wo o-tasuke-s-ase-ta. Ken- Suzuki-teacher- Naomi- help--- b. Ken-ga Naomi-ni Suzuki-sensei-wo o-tasuke-s-ase-ta. Ken- Naomi- teacher- help--- “Ken made Naomi help Prof. Suzuki (honored).” For (62), since the honorific stem attaches directly to the embedded verb tasuke, the person who is helped, not the causee, should be honored. From these facts, we can generally say that the position of the prefix o(originating as part of the honorific morpheme) is irrelevant to the semantics and pragmatics of honorification, contrary to the assumption suggested by Iida et al. (1994) and Manning et al. (this volume). Moreover, it is not necessarily attached to the immediately preceding verb stem but is in effect “preposed” so that the prefix remains always a prefix even in a morphophonologically complex sequence. We conclude this section with the complete account of the following sentence based on the revised version of the Morphonological Principle: (69b) Suzuki-sensei-ga Ken-wo o-aruk-ase-ninat-ta. Suzuki-teacher- Ken- walk--- “Prof. Suzuki (honored) made Ken walk.” The constituent structure is the following:
(71) [S [PP [NP Suzuki-sensei] [P ga]] [VP [VP [PP [NP Ken] [P wo]] [TVP [VP [V aruk]] [V sase]]] [V o-ninar]]]
We will see how, at the phenogrammatical level, the prefix o- ends up in a position before the main verb that immediately precedes the causative in the following schema:
(72)
constituent                                                                    possible values
[PP Suzuki-sensei ga]                                                          〈Suzuki-sensei·ga〉
[PP Ken wo]                                                                    〈Ken·wo〉
[VP aruk]                                                                      〈aruk〉
[V sase]                                                                       〈sase〉
[TVP [VP aruk] sase]                                                           〈aruk·sase〉
[VP [PP Ken wo] [TVP [VP aruk] sase]]                                          〈Ken·wo, aruk·sase〉
[V o-ninar]                                                                    〈o;ninar〉
[VP [VP [PP Ken wo] [TVP [VP aruk] sase]] o-ninar]                             〈Ken·wo, o;aruk·sase·ninar〉
[S [PP Suzuki-sensei ga] [VP [VP [PP Ken wo] [TVP [VP aruk] sase]] o-ninar]]   〈Suzuki-sensei·ga, Ken·wo, o;aruk·sase·ninar〉, 〈Ken·wo, Suzuki-sensei·ga, o;aruk·sase·ninar〉
The last two values show the possibility of scrambling of the two complements (the subject Suzuki-sensei-ga and the object Ken-wo).
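Continuing the illustrative sketch given after (65) (and again with expository names that are not part of the analysis itself), the same two operations reproduce the values in (72): suffixation of sase is the prefixless case, and the prefix of o-ninar wraps around the last element of its adjacent complement.

```python
# Reuses combine_adjacent / combine_nonadjacent from the earlier sketch.

tvp = combine_adjacent(["aruk"], "sase")                       # ['aruk·sase']
vp = combine_adjacent(["Ken·wo"] + tvp, "ninar", head_prefix="o")
print(vp)  # ['Ken·wo', 'o;aruk·sase·ninar']
print(combine_nonadjacent(["Suzuki-sensei·ga"], vp))
# ['Suzuki-sensei·ga', 'Ken·wo', 'o;aruk·sase·ninar']
```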
5.5 Remaining issues
Note the difference in acceptability in the sentences below. (73a) involves the verb complex mi-sase, while (73b) the genuine ditransitive verb mise. This is another case that suggests a different status of mi-sase and mise. (73) a. *Suzuki-sensei-ga Ken-ni Naomi-no sakuhin-wo Suzuki-teacher- Ken- Naomi- work- o-mi-sase-ninat-ta. see--- “Prof. Suzuki (honored) made Ken see Naomi’s work.” b. Suzuki-sensei-ga Ken-ni Naomi-no sakuhin-wo Suzuki-teacher- Ken- Naomi- work- o-mise-ninat-ta. show-- “Prof. Suzuki (honored) showed Ken Naomi’s work.” Unlike (58) above, an apparently identical construction involving mi-sase becomes unacceptable. Since any theory that predicts the acceptable status of (58) would also predict the acceptable status of (73a), and both “biclausal” and “monoclausal” approaches predict the acceptable status of (58), the unacceptable status of (73a) above is a problem for both approaches. One could argue that the difference in acceptability for mi-sase and mise here may come from the fact that mise is a much shorter form and preferred by some Gricean principle. This argument may have support from (58), where kak-ase “cause to write” doesn’t have a shorter counterpart. However, even though this line of argument could explain why (73a) is unacceptable, it would, unless properly constrained, also eliminate other acceptable sentences involving mi-sase due to the existence of the shorter form mise (cf. (13), (21), (31), (33a), (51a), where honorification is not involved). Note also that mi-sase and mise do not have exactly the same meaning: the former implies that the person who is made to see something actually sees it but the latter doesn’t. So, you cannot simply substitute the shorter mise for mi-sase in every case. The difference between mi-sase and mise also shows up with respect to the possibility of adding the object-honorification morpheme, shown below, where (60a) is repeated here as (74b): (74) a. *Ken-ga Suzuki-sensei-ni Naomi-no sakuhin-wo Ken- Suzuki-teacher- Naomi- work- o-mi-sase-si-ta. see--- “(Int.) Ken caused Prof. Suzuki (honored) to see Naomi’s work.” b. Ken-ga Suzuki-sensei-ni Naomi-no sakuhin-wo Ken- Suzuki-teacher- Naomi- work- o-mise-si-ta. show-- “Ken showed Naomi’s work to Prof. Suzuki (honored).”
Object honorification is impossible for mi-sase as opposed to mise (cf. (60a)). In the “monoclausal” approach, (74a) is potentially a good sentence just like (60a) with mise “show.” On the other hand, for a “biclausal” analysis, Suzukisensei-ni and Naomi-no sakuhin-wo belong to separate lists, and what is relevant to object honorification is the list for the “upper” one, namely that for sase, in which only the subject and the direct object (identified with Suzuki-sensei-ni ) reside. Thus, Suzuki-sensei-ni is the most oblique object and the sentence should be acceptable. Thus, somehow, mi-sase cannot be used with either type of honorification.38 For now, I leave open the problem involving mi-sase and honorification.
6 Conclusion
What I have demonstrated in this paper is that morphophonological (phenogrammatical) information, on one hand, and constituency (tectogrammatical) information, on the other, can be treated in a uniform representation based on feature structures and a constraint-based grammatical framework such as HPSG. In this approach, while morphophonological adjunction is adequately treated so that phonological and morphological information are properly encoded, the proper hierarchical constituent structure concerning such syntactic and semantic information as subcategorization and argument structure is represented in a separate place in the feature geometry. Specifically, what I have discussed in this paper is summarized below:
• Morphology (as well as phonology) has its own level of representation: phenogrammatical structure, which may be independent of the constituent (syntactic and semantic) structure: tectogrammatical structure. In particular, traversing the leaves of a constituent structure tree, which is constructed based on syntactic and semantic information by universal schemata, from left to right doesn’t necessarily have to correspond to the morphophonological representation.
• The phenogrammatical (morphophonological) structure (in the form of a nested list) is constructed in close relationship with tectogrammatical (syntactic and semantic) information. In particular, it is sensitive to adjacency and subcategorization.
• In this approach, the causative sase can have a VP embedded in its tectogrammatical structure, while, at the same time, the phenogrammatical representation is constructed so as to respect the morphophonological integrity between the stem of the head verb of the embedded VP and the causative.
38 This doesn’t seem to be a lexical idiosyncrasy of mi or mise, as ki “put on” and kise “cause to put on, dress” behave in exactly the same manner.
• Scrambling is considered to be a purely phenogrammatical phenomenon and can be quite independent of the tectogrammatical level.39
Although I have discussed only the causative and the honorifics, other bound morphemes such as the passive rare are also analyzed in exactly the same manner.
References Bird, S. and Klein, E. 1993. Enriching HPSG Phonology. Research paper EUCCS/RP-56, Centre for Cognitive Science, University of Edinburgh. Borsley, R. C. 1987. Subjects and complements in HPSG. Technical report CSLI-87-107. Stanford: Center for the Study of Language and Information. Dowty, D. R. 1996. Towards a minimalist theory of syntactic structure. In Discontinuous Constituency. Proceedings of the Tilburg Conference on Discontinuous Constituency, ed. H. Bunt and A. van Horck. Berlin: Mouton de Gruyter. 11– 62. Fukushima, K. 1992. Subcategorization, feature structure, and honorification in Japanese. Paper presented at Western Conference on Linguistics 22. Fukushima, K. 1993. S, , , and honorification in Japanese. Paper presented at HPSG Miniconference, July 24–25, 1993, Ohio State University. Gunji, T. 1987. Japanese Phrase Structure Grammar. Dordrecht: Reidel. Gunji, T. 1995. An overview of JPSG. In Japanese Sentence Processing: Proceedings of International Symposium on Japanese Syntax Processing, held in October 1991 at Duke University, ed. R. Mazuka and N. Nagai. Hillsdale, NJ: Lawrence Erlbaum. 105–133. Gunji, T. and Hasada, K., eds. 1998. Topics in Constraint-Based Grammar of Japanese. Dordrecht: Kluwer. Gunji, T. and Matui, M. F. 1995. Elements, structure, and constraint: toward a constraintbased phonology of Japanese. Unpublished manuscript, March 23, 1995. Graduate School of Language and Culture, Osaka University. Harada, S.-I. 1973. Counter EQUI NP deletion. Annual Bulletin, Research Institute of Logopedics and Phoniatrics, University of Tokyo. 113–147. Harada, S.-I. 1976. Honorifics. In Syntax and Semantics, ed. M. Shibatani, vol. 5: Japanese Generative Grammar. New York: Academic Press. 499–561. Hinrichs, E. and Nakazawa, T. 1994. Linearizing AUXs in German verbal complexes. In German in Head-driven Phrase Structure Grammar, ed. J. Nerbonne, K. Netter, and C. Pollard CSLI Lecture Notes no. 46. Stanford: Center for the Study of Language and Information. 11–37. 39
39 Proposals in the derivational theory in which some of the syntactic movements are “undone” at LF could be understood to be the transformational interpretation of the facts pointed out here (cf. Saito (1989) for scrambling and Kitagawa (1986) for causativization and honorification).
Iida, M. 1992. Context and Binding in Japanese. Ph.D. dissertation, Stanford University. Published 1996 Stanford: CSLI Publications. Iida, M., Manning, C., O’Neill, P., and Sag, I. A. 1994. The lexical integrity of Japanese causatives. Unpublished paper presented at the 68th meeting of the Linguistics Society of America. Department of Linguistics, Stanford University. Inoue, K. 1976. Henkei Bunpô to Nihongo (“Transformational Grammar and Japanese”). Tokyo: Taishukan. In Japanese. Kathol, A. 1995. Linearization-Based German Syntax. Ph.D. dissertation, The Ohio State University, Columbus, Ohio. Kitagawa, Y. 1986. Subjects in Japanese and English. Ph.D. dissertation, University of Massachusetts, Amherst. Published 1994. New York: Garland. Kuno, S. 1988. Blended quasi-direct discourse in Japanese. In Papers from the Second International Workshop on Japanese Syntax, ed. W. J. Poser. Stanford: Center for the Study of Language and Information. 75–102. Kuroda, S.-Y. 1965. Causative forms in Japanese. Foundations of Language 1: 30–50. Manning, C. and Sag, I. A. 1995. Dissociations between argument structure and grammatical relations. Unpublished working draft as of June 17, 1995. Stanford: Center for the Study of Language and Information. Manning, C., Sag, I. A., and Iida, M. (this volume). The lexical integrity of Japanese causatives. McGloin, N. H. 1976. Negation. In Syntax and Semantics, ed. M. Shibatani, vol. 5: Japanese Generative Grammar. New York: Academic Press. 371– 419. Miyagawa, S. 1980. Complex Verbs and the Lexicon. Ph.D. dissertation, University of Arizona, Tucson. Available as Coyote Papers: Working Papers in Linguistics from A → Z, vol. 1, University of Arizona. Muraki, M. 1978. The sika nai construction and predicate restructuring. In Problems in Japanese Syntax and Semantics, ed. J. Hinds and I. Howard. Tokyo: Kaitakusha. 155–177. Pollard, C. J. 1984. Generalized Context-Free Grammars, Head Grammars and Natural Language. Ph.D. dissertation, Stanford University. Pollard, C. J. and Sag, I. A. 1987. Information-based Syntax and Semantics, vol. 1: Fundamentals. CSLI Lecture Notes Series no. 13. Stanford: Center for the Study of Language and Information. Pollard, C. J. and Sag, I. A. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information. Reape, M. 1996. Getting things in order. In Discontinuous Constituency. ed. H. Bunt and A. van Horck Berlin: Mouton de Gruyter. 209–253. Riehemann, S. 1993. Word Formation in Lexical Type Hierarchies: A Case Study of bar-Adjectives in German. Master’s thesis, University of Tübingen. Saito, M. 1989. Scrambling as semantically vacuous A′ movement. In Alternative Conceptions to Phrase Structure, ed. M. R. Baltin and A. S. Kroch Chicago: University of Chicago Press. 182–200. Sells, P. 1994. The projection of phrase structure and argument structure in Japanese. Unpublished manuscript, Stanford University, August, 1994. Shibatani, M. 1973. Semantics of Japanese causativization. Foundations of Language 9: 327–373.
4. “Modal flip” and partial verb phrase fronting in German1 . Carnegie Mellon University
1 Introduction The purpose of this paper is to integrate several analyses of German verbal phenomena, namely verb second (V2), modal flip, and partial verb phrase (PVP) fronting. I will build on the literature from head-driven phrase structure grammar (HPSG), including primarily the analyses of Pollard and Sag (1987, 1994), Pollard (1996), Nerbonne (1986, 1994), and Hinrichs and Nakazawa (1989, 1994). The central idea proposed in this paper is that the same range of partial VPs are involved in both fronting and modal flip constructions, and that flat structures suffice otherwise. This should provide an account of a broader range of modal flip data than was covered by Hinrichs and Nakazawa. By blocking the appearance of (P)VPs in the German Mittelfeld, the field of nonfronted verb arguments and adjuncts, I avoid the spurious ambiguity in the matrix clause entailed by Pollard (1996). I argue below for an account of verbal phenomena in HPSG which has the advantage over previous accounts of • establishing a common phrase structure for PVPs in modal flip contexts and in fronted contexts; • simplifying the subcategorization requirements for auxiliaries, eliminating ambiguous lexical entries for auxiliary, which lead to spurious ambiguity; 1
First I would like to acknowledge Carl Pollard, who introduced me to HPSG and supported me over the time that this paper was a work in progress. Special thanks goes to Lori Levin, who advised and encouraged me during the writing process. The detailed comments of Alex Franz, Andreas Kathol, Tsuneko Nakazawa, John Nerbonne, Detmar Meurers, and three anonymous reviewers enabled me to produce my final draft. Thanks to the following people for their time in discussing the paper: Bob Carpenter, Dan Everett, Ted Gibson, and Brad Pritchett. And thanks to my German informants, Andreas Kathol, Daniela Lonsdale, Thomas Polzin, and Bernhard Suhm.
162
.
•
offering a uniform treatment of the constraint on subcategorized PVPs, that they must include their governed verbs; offering for the first time a hypothesis about the lexical structure for German verbs with nonagentive subjects in HPSG. Key in this analysis is my proposal that grammatical subjects of these verbs are underlyingly complements in the lexicon.
•
The paper proceeds as follows: first, I introduce the German data which bears on the analysis. Next, I summarize an HPSG account of V2, giving a flat rule schema for sentence and verb phrase which is in line with this account. My account follows Pollard and Sag (1994: ch. 9), which assumes distinguished and features instead of a single feature for subcategorized complements. Finally, I show how the modal flip data and the PVP data are accounted for by introducing into the grammar a lexical template for auxiliary and a lexical rule for PVP fronting.
2
Some phenomena from German
2.1 V2 Verb second, or V2, is the phenomenon in languages such as German, Dutch, and Yiddish, such that the verb always appears second in matrix clauses. It has been well documented in the literature that the first constituent in a German matrix sentence may be almost any part of speech (Uszkoreit 1987a: section 1.5).2 In line with Uszkoreit (1987a), Nerbonne (1986, 1994), and Pollard (1996), this paper assumes that all matrix sentences in German are verb-initial, and that constituents which precede the head verb in a matrix sentence are either a) the result of fronting, or topicalization, or b) adjuncts. The idea of postulating an underlying verb-initial sentence structure goes back at least to den Besten (1983) working in a transformational framework. A few examples of German V2 sentences follow. (1) Er wird das Buch lesen. He[nom] will the book[acc] read “He will read the book.” (2) Das Buch wird er lesen. The book[acc] will he[nom] read “He will read the book.” (3) Lesen wird er das Buch. Read will he[nom] the book[acc] “He will read the book.” 2
Uszkoreit notes that Drach (1963) provides a detailed account of the sentential elements which may precede the finite verb in a German sentence.
“Modal flip” and PVP fronting in German
(4)
Dann wird er das Buch lesen. Then will he[nom] the book[acc] read “Then he will read the book.”
(5)
In diesem Zimmer wird er das Buch lesen. in this[dat] room will he[nom] the book[acc] read “He will read the book in this room.”
2.2
Double infinitive
163
The double infinitive construction (see den Besten and Edmondson 1983) is a German constituent comprised of the infinitival forms, or base forms, of an auxiliary verb and a main verb. Double infinitive occurs in verb-final position, and the auxiliary follows the main verb. (6)
Er wird das Examen bestehen können. He will the exam pass[bse] be-able-to[bse] “He will be able to pass the exam.”
In (6), bestehen können is the double infinitive. Bestehen is the main verb and können is a modal auxiliary. A double infinitive may consist of a main verb and a verb which takes a VP complement, such as sehen (“see’’), hören (“hear’’) or lassen (“let”). For example, singen hören (“sing hear”) is a double infinitive. There may be more than two infinitives in the “double” infinitive, as shown in (7). (7)
Sie wird Cecilia die Nilpferde füttern dürfen lassen. She will Cecilia the[acc] hippos feed be-allowed-to let “She will let Cecilia be allowed to feed the hippos.”
We claim that the double infinitive is a constituent in German in contexts including at least verb-final sentences. One argument for the constituency of the double infinitive is that a finite auxiliary may “flip” over a double infinitive (see section 2.3) but may not come between a base form verb and a base form auxiliary. (8)
a. daß er das Examen wird bestehen können that he the exam will pass be-able-to b. daß er das Examen bestehen können wird that he the exam pass be-able-to will c. *daß er das Examen bestehen wird können that he the exam pass will be-able-to
Another observation is that, in a V2 sentence, the double infinitive may be fronted:
164
.
(9)
Bestehen können wird er das Examen. pass be-able-to will he the exam “He will be able to pass the exam.”
However, the main verb infinitive may also be fronted in a V2 sentence without the infinitive auxiliary.3 (10) Bestehen wird er das Examen können. pass will he the exam be-able-to “He will be able to pass the exam.” This ability to “split” the double infinitive via the fronting of a base form verb, and the contrast between (8c) and (10), suggest that we will need to look at double infinitive constituency a bit more closely.
2.3
Modal flip
Normally, in German subordinate clauses, the finite verb comes at the end of the clause. For example, in (11), the finite auxiliary follows a double infinitive: (11) Ich wußte, daß er das Examen bestehen können würde. I knew that he the exam pass[bse] be-able-to[bse] would[fin] “I knew that he would be able to pass the exam.” Modal flip (see Johnson 1986) occurs in German when the auxiliary in an embedded clause precedes its double infinitive complement. Example (12), a grammatical alternative to (11), is from Hinrichs and Nakazawa (1994: (1b)): (12) Ich wußte, daß er das Examen würde bestehen können. I knew that he the exam would[fin] pass[bse] be-able-to[bse] “I knew that he would be able to pass the exam.” Modal flip is interesting because the finite auxiliary in the “flipped” case separates the main verb from its complements. This fact contradicts any grammar that proposes that the double infinitive, in a complex without the finite auxiliary, heads a contiguous VP. We also note that modal flip does not (usually) occur around a single base form infinitive:4 3
The auxiliary verb cannot be fronted by itself: *können wird er das Examen bestehen be-able-to will he the exam pass
4
This may be a special property of bare auxiliaries. Nerbonne (1994) restricts this sentence with a condition added into his /-PVP rule. (See section 5.1 of this paper.) Hinrichs and Nakazawa (1994: 6a) offer the following as a grammatical sentence, citing den Besten and Edmondson (1983): (i)
Weil er nicht anders hat können. because he not otherwise has be-able-to “Because he couldn’t do differently.”
The analysis in this paper does not admit this sentence as grammatical.
“Modal flip” and PVP fronting in German
165
(13) *Ich glaube, daß er wird kommen. I believe that he will come[bse] For some auxiliaries, such as werden (“will”), modal flip is optional; for others, such as haben (“have”), modal flip is obligatory. The reader is referred to Bech (1995) and Hinrichs and Nakazawa (1994) for detailed accounts of modal flip phenomena, which are quite intricate. As pointed out in Kroch and Santorini (1991), Hinrichs and Nakazawa (1994), and Nerbonne (1994), there may in some cases be a constituent between the governing auxiliary and the double infinitive: (14) daß er ihnen hätte alles schicken sollen that he them would-have everything send should “that he should have sent them everything” In their account with movement, Kroch and Santorini (1991) describe a rule of syntactic lowering that moves quantified or emphatically stressed NPs, in sentences in which modal verbs appear in the perfect tense (e.g. hätte in (14)). In one example, a quantified subject lies between the finite auxiliary and the double infinitive (Kroch and Santorini 1991: (60c)): (15) daß gestern hätte keiner kommen dürfen that yesterday would-have nobody[nom] come be-allowed-to “that nobody would have been allowed to come yesterday” The ability to have one or more NPs between the flipped auxiliary and the double infinitive seems to vary across speakers. Hinrichs and Nakazawa (1994) allow for intervening NPs only when the auxiliary in the double infinitive takes a VP complement.5 We suggest that sentences (14) and (15) are examples of extraposition. We use the term extraposed to describe a VP that occurs after the position of the final tensed verb. In (14), the extraposed phrase is alles schicken sollen (“should send everything”) and in (15) it is keiner kommen dürfen (“nobody be allowed to come”). We introduce these two phrases as partial verb phrases, or PVPs, which are phrases consisting of a verb and some of the verb’s arguments.
2.4
Partial verb phrase fronting
Partial verb phrases have been observed in the first position of a German sentence.6 In (16 –18) we show sentences that begin with PVP.7 5 6 7
So sentences (14) and (15) are not admissible by Hinrichs and Nakazawa (1994). Den Besten and Webelhuth (1990) refer to the phenomenon as remnant topicalization. We follow Nerbonne (1994) and Pollard and Sag (1994: ch. 9) in not showing a trace in the main clause for the fronted PVP. The rule which Nerbonne uses to admit sentences with fronted PVPs in the grammar is given in section 5.1. Also note Haider (1990) with its claim that PVPs are base-generated in topic position.
166
.
S PVP
S/PVP
NP[acc]
V[bse]
[fin]
NP[nom]
NP[dat]
Das Märchen
erzählen
wird
er
ihr
Figure 1 Tree structure for fronted partial verb phrase (16) Das Märchen erzählen [wird er ihr ]S/PVP . The fairy-tale[acc] tell [will he[nom] to-her[dat]]. “He will tell the fairy-tale to her.” (17) Ihr erzählen [wird er das Märchen ]S/PVP . to-her[dat] tell [will he[nom] the fairy-tale[acc]]. “He will tell her a fairy-tale.” (18) Das Examen bestehen [wird er können]S/PVP . The exam[acc] pass [will he[nom] be-able-to]. “He will be able to pass the exam.” The tree structure for (16) is given in figure 1. As we do not wish to delve too far into the analysis, we will postpone the exposition of the flat S structure in this figure to section 3.2. In (16), there is a PVP constituent, das Märchen erzählen (“tell the fairy-tale”), in which the verb, erzählen (“tell”), and the accusative object, das Märchen (“the fairy tale”), have been fronted without the verb’s dative object, ihr (“her”). A full VP would normally consist of the main verb and all of the verb complements, e.g. ihr das Märchen erzählen (“tell her the fairy-tale”). In (17), the verb has been fronted with the dative object rather than with the accusative object. Sentences (16) and (17) are both grammatical, but their acceptability seems to vary across speakers, as documented in Heidolph et al. (1981) and Uszkoreit (1987b).
2.5
Partial verb phrases and spurious ambiguity
Pollard (1996) notes that it is a fundamental premise of a phrase structure grammar that constituent structure be linguistically significant. Our assumption here is that any phrase structures introduced in the grammar must be well motivated. Furthermore, two different phrase structures may not exist for the same bundle of information; there must be a semantic, phonological or pragmatic basis for two syntactic representations for the same utterance.8 8
Rich Thomason (p.c.) has suggested that there might in fact be other, purely syntactic indications of ambiguity. I leave this issue open to investigation.
“Modal flip” and PVP fronting in German
167
S [inv+]
NP[nom]
NP[dat]
NP[acc]
V[bse]
wird
er
ihr
das Märchen
erzählen
Figure 2 Auxiliary raising all verb complements S [inv+]
wird
NP[nom]
er
NP[dat]
ihr
PVP NP[acc]
V[bse]
das Märchen
erzählen
Figure 3 Auxiliary subcategorizing for PVP in main clause We use the term spurious ambiguity to refer to ambiguous phrase structures in the grammar which have no basis for distinction.9 In our theory we need a phrase structure rule for PVP in the grammar so that the constituents which are fronted in (16) and (17) are defined. However, as shown in Pollard (1996), there is a potential problem with spurious ambiguity in the grammar when a rule for PVP coexists in the grammar with raising by the auxiliary verb. The reader is referred to chapter 1 of this volume for a discussion of raising by structure sharing among feature structures. We note that that discussion is somewhat parochial for English. In German and in other languages, it is possible to raise not only subjects via the mechanism of structure-sharing, but complements as well. When the grammar admits PVPs into the matrix clause and has auxiliaries which are raising verbs, the possibilities for auxiliary subcategorization are the following: 1. 2.
3.
9
The auxiliary subcategorizes for a verb, raising all of the verb’s complements. (See figure 2.) The auxiliary subcategorizes for a PVP, raising the complements of the head of the PVP which are not part of the PVP. That is, the head verb’s complements are not daughters of the PVP. (See figure 3.) The auxiliary subcategorizes for a full VP, raising no complements of the head of the VP. (See figure 4.)
9 Spurious ambiguity arises in HPSG when two different tree structures have the same value for the attribute that contains syntactic and semantic information.
[S[inv+] wird [NP[nom] er] [VP [NP[dat] ihr] [NP[acc] das Märchen] [V[bse] erzählen]]]
Figure 4 Auxiliary subcategorizing for VP in main clause Another problem noted by Pollard is that the spurious ambiguity introduced by PVPs compounds with multiple auxiliaries in a single sentence. The reader can imagine the various levels of raising which are possible with multiple auxiliaries. In the approach we pursue below, PVPs do not appear in syntactic environments where there is independent justification for another phrase structure, e.g. a flat phrase structure. In this respect we follow Nerbonne (1994), who, citing Haider (1990), argues convincingly that the topicalization of a PVP constituent is not proof positive of its existence in the Mittelfeld, observing that “constituents” can be so identified only with respect to their position in a phrase. Since our analysis is syntactic, we will not be looking at the semantic, phonological or pragmatic factors which may bear on the issue.
2.6 Unaccusative verbs We now briefly review unaccusativity in German verbs, in order to provide background for the classification of verbs which may be fronted in PVP with their subjects. The Unaccusative Hypothesis formulated by Perlmutter (1978) states that there are two types of intransitive verbs, unaccusative and unergative verbs.10 An unaccusative verb is so-called since it cannot take an object with accusative case; two German examples are fallen (“fall”) and ankommen (“arrive”). In Government Binding theory, this restriction follows from the lexical property of such verbs that they select a d-structure object but no external argument. Perlmutter makes the generalization that unaccusative verbs fail to undergo impersonal passivization in German. Example (19) is from Pollard (1994: (9a,b)): (19) a. Der Zug ist angekommen. The train[nom] has arrived b. *Hier ist angekommen worden. Here has arrived been “Here has been arrived.” 10
10 For fuller discussion see Levin and Hovav (1995).
There are some syntactic diagnostics for unaccusativity, which do not categorize unaccusative verbs neatly. These diagnostics include the inability to form the impersonal passive, the formation of the adjectival passive, and auxiliary verb selection. Kathol (1992), citing Dowty (1991) and Zaenen (1988), points out that German unaccusative verbs are characterized semantically (in part) by nonagentive properties for their subjects. Levin and Hovav (1995) would argue that the semantic and syntactic characterization of these verbs is interrelated; something that is semantically caused may be syntactically active. The formation of the passive, a diagnostic for these verbs, is a syntactic construction. The reader is referred to Kathol (1992), Zaenen (1988), Levin and Hovav (1995), etc. since it is beyond the scope of this paper to fully address this topic.
2.7 Fronting base form verbs with subjects
Uszkoreit (1987a) attributes to Haider (p.c.) the observation that certain German verbs may be fronted with their subjects. We show a fronted subject in (20). These “certain” verbs have subjects which are not agentive. (20) Ein wirklicher Fehler unterlaufen [war ihm noch a real mistake[nom] occur [was to-him[dat] still nie] never] “He never made a real mistake.” (Uszkoreit 1987a: (14a)) In the normal case, fronting subject together with verb is not permitted: (21) *Er erzählen [wird ihr das Märchen]. He[nom] tell [will to-her[dat] the fairy-tale[acc]] “He will tell her the fairy tale.” And the verb unterlaufen (“occur”) in (20) cannot undergo impersonal passivization: (22) *Ihm wurde von einem wirklichen Fehler noch nie unterlaufen. to-him has by a real mistake still never occurred The classification of verbs which do front their subjects may include not only unaccusative verbs but other verbs as well. In (23) we show the verb ausmachen (“affect”/“matter”). The passives in (23b) and (23c) are ungrammatical, while fronting is questionable in (23d) and (23e). (23) a. Ihm hat der Unfall nichts ausgemacht. to-him[dat] has the accident[nom] nothing[acc?] mattered “The accident did not affect him.” b. *Ihm wurde von dem Unfall nichts ausgemacht. to-him has by the accident[dat] nothing mattered
c. *Nichts wurde ihm von dem Unfall ausgemacht. nothing has to-him[dat] by the accident[dat] mattered d. *?Der Unfall ausmachen wird ihm nichts. The accident[nom] matter will to-him[dat] nothing “The accident will not affect him.” e. ??Nichts ausmachen wird ihm der Unfall. nothing matter will to-him[dat] the accident[nom] “The accident will not affect him.” f. So viel ausmachen kann ihm das nicht. So much matter be-able to-him[dat] that[nom] not “That can’t bother him very much.” Ausmachen may be a verb for which case assignment goes along with semantic role assignment; such verbs with quirky case do not passivize. (See Belletti and Rizzi 1988.) We will follow Perlmutter (1978) in showing no underlying subjects for such verbs, suggesting an HPSG lexical template in section 5.4.
3 Phrase structure rules
Beginning in this section we present our HPSG grammar fragment for German. We will use the version of HPSG in Pollard and Sag (1994: ch. 9). In particular, we will use the features and for subjects and complements, in addition to a single list for all arguments. We also use Pollard and Sag’s treatment of filler-head constructions that does not include traces in main clauses. A rule fashioned after their Complement Extraction Lexical Rule will be introduced later, for PVP fronting. In section 3.1 we review a grammar fragment for V2. In section 3.2 we introduce the phrase structure rule that we will use for German sentence structure, and the notion of a “subjectless” analysis of a main clause. Finally, in section 3.3, we refine our phrase structure rule such that it becomes the single rule in our grammar for sentence, VP and PVP.
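Before turning to the rules themselves, a minimal sketch may help readers unfamiliar with HPSG valence features. The representation below (plain Python dictionaries with the labels subj, comps, and args) is an expository assumption, not the chapter’s actual feature geometry; it simply shows a verb carrying a distinguished subject, a list of complements, and a single combined list of all arguments ordered from least to most oblique.

```python
# Expository sketch of a valence-bearing lexical entry; labels are assumptions.

def make_verb(phon, subj, comps):
    """Build a toy lexical entry: separate subject and complement lists,
    plus one combined list of all arguments (least oblique first)."""
    return {"phon": phon, "head": "verb", "subj": subj, "comps": comps,
            "args": subj + comps}

erzaehlen = make_verb("erzählen", ["NP[nom]"], ["NP[acc]", "NP[dat]"])
print(erzaehlen["comps"])   # ['NP[acc]', 'NP[dat]']
print(erzaehlen["args"])    # ['NP[nom]', 'NP[acc]', 'NP[dat]']
```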
3.1 ID rules
In HPSG, the set of allowable phrase structures is specified by a small set of Immediate Dominance schemata, or ID rules. We review how the feature interacts with ID schemata in German to result in V2 sentences. We follow the analysis of V2 in Pollard (1996), which uses the binary head feature . captures the fact that there are two different verbal positions in German: either preceding the subject and complements, or following the subject and complements. We show + and − sentences in figures 5 and 6, respectively. The Filler-Head rule schema (Pollard and Sag 1994: schema 6) and Head-Adjunct rule schema (Pollard and Sag 1994: schema 5) introduce the constituent daughters and . and can be
S [inv+]
NP[nom]
NP[acc]
V[bse]
Wird
er
das Buch
lesen?
Figure 5 Tree structure for + sentence S NP[nom]
NP[acc]
V[bse]
[inv−]
er
das Buch
lesen
wird
Figure 6 Tree structure for − sentence S
NP[nom]
S/NP [fin,inv+]
NP[acc]
V[bse,inv−]
wird
das Buch
lesen
er
Figure 7 Tree structure for Filler-Head schema S
Heute
:S[inv+] [fin,inv+]
NP[nom]
NP[acc]
V[bse,inv−]
wird
er
das Buch
lesen
Figure 8 Tree structure for Head-Adjunct schema sisters to a phrasal head which is an + sentence (but an adjunct daughter is not limited to S sisters). The filler precedes the head, and the adjunct may precede the head. V2 sentences result. Our sample tree diagrams for these two rules are in figures 7 and 8.
S[marked]
daß
:S[inv−] NP[nom]
NP[acc]
er
das Buch
V[bse,inv−] [fin,inv−] lesen
wird
Figure 9 Tree structure for Head-Marker schema :[〈[1]〉]
weil
:s[inv−][[1]]
NP[nom]
NP[acc]
er
das Buch
V[bse,inv−] [fin,inv−] lesen
wird
Figure 10 Tree structure for Head-Complement schema
The Head-Marker schema (Pollard and Sag 1994: schema 4) introduces complementizers or markers before − sentences. For example, daß (“that”) is a marker which subcategorizes for an − sentence. See figure 9. The head in the Head-Marker schema is a phrasal head. The use of the Head-Complement schema with a head daughter which is a subordinate conjunction, and a complement which is a − sentence, creates a subordinate clause. We show a subordinate clause in figure 10, where we prefix the attribute value matrix, or AVM, of the head with H, and of the complement, with C. The clause may be an adjunct which combines with an + sentence using the Head-Adjunct schema. The Head-Complement schema takes a lexical head.
3.2 ID rule R2: flat S structures
Analyses of scrambling date to Ross’s dissertation (1967), and to the description of German scrambling phenomena in Bierwisch (1963). In a transformational framework, one generalizes about scrambling phenomena that constituents are moved out of their underlying positions and adjoined to the front of a VP, IP (inflectional phrase, or, sentence) or AP (adjective phrase) (von Stechow and Sternefeld 1988, Grewendorf and Sternefeld 1990). Accounts of free word
order without movement, such as the analysis offered here, take a different approach. We include phrase structure rules in the grammar which can generate subjects in either their canonical or scrambled positions. Nerbonne (1986, 1994), Uszkoreit (1987a), and Pollard (1996) propose flat structures for German sentences. One of the reasons for this is the manifestation of scrambling in the German Mittelfeld.11 We will use the term scrambling somewhat loosely to refer to the free ordering of subject and other complements of V. (24) Dann wird die Pille der Doktor dem Patienten geben. Then will the pill[acc] the doctor[nom] to-the patient[dat] give “Then the doctor will give the pill to the patient.” The arguments of geben (give) in sentence (24), including the subject Doktor, may be permuted (Uszkoreit 1987a: (93a–b)): (25) a. Die Pille gibt der Doktor dem Patienten. the pill[acc] gives the doctor[nom] to the patient[dat] b. Dem Patienten gibt die Pille der Doktor. to-the patient[dat] gives the pill[acc] the doctor[nom] The feature geometry of HPSG, as discussed in chapter 1 of this volume, ensures that the auxiliary may specify head features only for the heads of its subcategorized-for arguments, which are the members of its and list, and not for any of those arguments’ daughters. If the auxiliary were to subcategorize for some projection of a nonfinite verb which included the subject,12 then the auxiliary would not be able to specify the case of the subject. Suffice it to say that in order for the finite verb to specify nominative case for its subject argument, the two must be sisters. The subject cannot be the daughter of a verbal projection of a base form verb. This paper will follow Nerbonne, Uszkoreit, and Pollard in its assumption that all finite verbs and finite auxiliaries in German head flat S structures. We call the ID rule for flat S R2 since the name of the head-complement schema for English in Pollard and Sag (1994) is R2. In (26), and in our ID rules which follow, we are using a shorthand notation for the rule in that the elements of the list of the head, marked with 〈C0, . . . , Cn〉, ought actually to be the values of the complement daughters C0, . . . , Cn. As we introduce PVPs into the grammar, this rule will be revised. (26) R2. Head-complement rule or flat rule. [COMPS〈 〉] ⇒ HEAD[COMPS〈C0, . . . ,Cn〉], C0, . . . ,Cn 11
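As a rough illustration of what rule (26) licenses, the following sketch (an expository assumption, not the paper’s formal machinery) checks that a flat phrase consists of a lexical head plus exactly the complements that the head’s comps list calls for, with the mother’s comps list emptied; surface order among the sisters is treated as a separate matter, as it is left to linear precedence rules in the analysis.

```python
# Expository sketch of the flat Head-Complement rule R2 in (26).

def licensed_by_r2(head_comps, complement_daughters):
    """True if the complement daughters saturate the head's comps list
    (simplified to an order-insensitive match, since ordering in the
    Mittelfeld is handled by linear precedence rules)."""
    return sorted(head_comps) == sorted(complement_daughters)

# 'wird' as a finite auxiliary that has inherited its verb's arguments:
wird_comps = ["NP[nom]", "NP[acc]", "V[bse]"]
print(licensed_by_r2(wird_comps, ["NP[nom]", "NP[acc]", "V[bse]"]))  # True
print(licensed_by_r2(wird_comps, ["NP[nom]", "V[bse]"]))             # False
```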
12
Nerbonne has reminded me that he proposes a flat S structure as well because he finds constituent structure tests to be contradictory in indication. This was proposed in Uszkoreit (1987b).
.
174
S[COMPS〈 〉] V [FIN,COMPS〈C0, . . . , Cn 〉]
C0
C1
...
Cn
Figure 11 Tree structure for R2: Head-Complement Rule Figure 11 displays a tree structure for ID rule R2. The rule R2 holds good for both VP and S. In the case of VP, the values of the mother and of the head daughter will both be unsaturated.13 In the case of sentences (i.e. finite matrix clauses), we are adopting Borsley’s “subjectless” analysis for main clauses in Welsh (Borsley 1989), which allows the subject of a finite V to be undistinguished among complements in S. Borsley argues that what are usually considered subjects of finite VSO clauses in Welsh are actually least oblique complements. The subjectless analysis is characterized by the lexical rule in (27), which takes as input a base form verb lacking a subject and yields a finite verb not subcategorizing for a subject. Simply put, the subject moves to the list. (27) Lexical rule for “subjectless” analysis of main clauses. Adapted from Pollard and Sag (1994: ch. 9, (16)) category category : verb : verb :bse → :fin :1 :〈 〉 :2 :1 + 2 We use the lexical rule in order that finite auxiliaries can participate in our R2. In an application of rule R2 to S, the values of the mother and of the head daughter will be empty lists. In our German grammar fragment we do not have a rule for S → NP VP, which is P&S Schema 1, the Head-Subject Schema. Schema 1 exists, for example, in English. Both word order variation and the desire to avoid ambiguity in matrix clauses are reasons to include but one phrase structure for sentence, R2, in the grammar. R2 is a head-complement schema. We admit, however, that we have not investigated whether we might need Schema 1, a head-subject schema, or a head-subject-complement schema, in the full grammar, for any other kinds of German phrases (i.e. nonverbal phrases).
ID rule R2 ′: partial verb phrase rule
3.3
Rule R2 (26), as it stands, will not create the (P)VP constituents which may appear either in the first (fronted) position or last (extraposed) position of a 13
Except in the case of verbs which do not subcategorize for a subject; see section 5.4.
“Modal flip” and PVP fronting in German
175
German sentence. We need to relax the rule so that it will include the head verb and some number of the head’s nonsubject complements. We will call our revised rule (28) R2′ since it is a variant of the Head-Complement Schema, R2 (26). This rule supersedes rule (26). It includes all phrases licensed by (26) but also allows nonsaturated phrases. Like R2, then, R2′ is a single phrase structure rule for both S and VP, but it also includes PVP. (28) R2′: Head-complement rule or flat rule. A phrase whose daughters are one head daughter and one or more complement daughters. The head daughter is a word. [COMPS〈C′0, . . . , C′p 〉] ⇒ HEAD[COMPS〈C0, . . . , Cn〉], C ″0, . . . ,C″q (n ≥ 0, q, p ≤ n) 〈C 0′, . . . , C p′ 〉 ( 〈C″0, . . . , C″q 〉 = 〈C0, . . . , Cn〉14 〈C 0′, . . . , C p′〉 lists the unsaturated complements of the PVP and C ″0, . . . , C″q is the sequence of the complements of the PVP. We use ( for the sequence union operator. Sequence union, which was originally suggested for German word order by Reape (1996), is a relation that holds over three lists. By taking elements from each of two lists, in turn, in their original relative order, one obtains the resultant list. Essentially, it means that we are not limited to a strict list append 15 to combine the PVP’s complement daughters and the PVP’s outstanding complements to form the list of the head verb. For example, if the list of the head verb is 〈NP[nom],NP[acc],NP[dat]〉 then sequence union allows 1 and 2 to be instantiated as follows: 1 = 〈NP[nom],NP[acc],NP[dat], 2 = 〈 〉 1 = 〈NP[nom],NP[acc]〉, 2 = 〈NP[dat]〉 1 = 〈NP[nom]〉, 2 = 〈NP[acc],NP[dat]〉 1 = 〈NP[acc]〉, 2 = 〈NP[nom],NP[dat]〉 1 = 〈NP[nom],NP[dat]〉, 2 = 〈NP[acc]〉 1 = 〈 〉, 2 = 〈NP[nom],NP[acc],NP[dat]〉 1 = 〈NP[acc],NP[dat]〉, 2 = 〈NP[nom]〉 1 = 〈NP[dat]〉, 2 = 〈NP[nom],NP[acc]〉 A tree structure for ID rule R2′ is given in figure 12. Rule R2′ allows a verbal head of a PVP to take, for example, either an accusative object or a dative object as a single complement. Strictly speaking, since R2′ is a flat rule, the PVP’s complement daughters are not in a list, but an unordered sequence constrained by the obliqueness hierarchy. The operator ( relaxes the manner in which the complements are “picked up” by the head. We want to be able to form a PVP from a head verb with some sequence of its complements, but 14
15
My thanks to Carl Pollard for advice about the formulation of R2′, and to an anonymous reviewer for further suggestions. The operation on two lists such that their contents are added together in order, first one list and then the other, to form a new list.
176
.
VP[COMPS1] V [BSE,COMPS1 ( 2]
C 0″
C 1″
...
C q″
2 = 〈C 0″, . . . ,C q″〉 Figure 12 Tree structure for R′: Partial Verb Phrase Rule we do not want to be required to choose the least oblique argument first. In (17) (repeated below as (29)) we form a PVP from a head verb alone with its dative argument: (29) Ihr erzählen [wird er das Märchen ]S/PVP . to-her[dat] tell [will he[nom] the fairy-tale[acc]] “He will tell her a fairy-tale.” In (30) and (29) ihr erzählen (“to-her tell”) is a valid PVP as is das Märchen erzählen (“the fairy-tale tell”). The head verb in the PVP (here, erzählen) will not be required to choose the accusative object first just because it comes first on the list. (30) Das Märchen erzählen [wird er ihr]S/PVP . The fairy-tale[acc] tell [will he[nom] to-her[dat]] “He will tell the story to her.” (31) Ihr das Märchen erzählen [wird er]S/PVP . to-her[dat] the fairy-tale[acc] tell [will he[nom]] “He will tell the story to her.” The grammaticality of the PVP ihr das Märchen erzählen (31) in comparison with the ungrammatical PVP *das Märchen ihr erzählen is determined by linear precedence rules, namely the German rule that pronominal NPs precede nonpronominal phrases. Note that Uszkoreit (1987a) has claimed that LP rules for constituents of the Mittelfeld hold across PVP. The ID rule R2′ states which constituents may be chosen at one time by the head verb; it does not dictate the final ordering of the complements when they are selected all at once in a flat constituent. The relative ordering of unsaturated complements on the list remains intact, preserving their relative obliqueness.16
4
Modal flip
In section 4.1 we critique the analysis of modal flip given in Hinrichs and Nakazawa (1989) and Hinrichs and Nakazawa (1994). We follow Hinrichs and Nakazawa’s now-standard assumption (1989) of argument inheritance by auxiliary verbs, claiming that a German auxiliary raises all of its complement 16
See Uszkoreit (1987a) for a nonmovement treatment of the order of constituents in the German Mittelfeld.
“Modal flip” and PVP fronting in German
177
verb’s arguments, structure-sharing them within a single S node. In our grammar fragment for modal flip (section 4.2), we present a set of linear precedence rules for German and a lexical specification for auxiliary subcategorization which avoids spurious ambiguity and which allows a modal to “flip” over an NP. Some remaining problems are noted in section 4.3.
4.1
Hinrichs and Nakazawa’s account of modal flip
Modal flip (section 2.3) motivates Johnson (1986) and Hinrichs and Nakazawa (1989, 1994) to build a constituent from the finite auxiliary plus the double infinitive. Hinrichs and Nakazawa have proposed that, when an auxiliary appears between the verb complex and its complements, the auxiliary and the double infinitive form a verb complex in the syntax which subcategorizes for the main verb’s complements. By their account, the auxiliary is a raising verb. We show the lexical entry for werden following Hinrichs and Nakazawa (1994) in (32). The auxiliary subcategorizes for a verbal head and all of the head’s unsaturated complements. In (32) the feature indicates whether a verbal complex has picked up any NP complements. The complement of the auxiliary must be −; that is, it is a verb or a verb complex which has not picked up any NP complements. (32) Entry with Raising for auxiliary verb werden (Hinrichs and Nakazawa 1994): werden verb : :bse :+ :1 verb : :bse :2 + :2 :1〈NP〉 :− Hinrichs and Nakazawa use a binary branching tree structure to form a verbal complex from the auxiliary verb and the double infinitive. Figures 13 and 14 are examples of their tree structures for cases of double infinitive with modal flip and without modal flip, respectively (Hinrichs and Nakazawa 1994: (11a) and (11b)). The authors assume that the governing auxiliary in these two examples, werden, will pick up the double infinitive before any other complements. This is because the verbal complex is the most oblique complement of the auxiliary, and, therefore, the last thing on the auxiliary’s list.17 17
Also, list, in this paper.
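To make the inheritance mechanism concrete, here is a minimal Python sketch (ours, not Hinrichs and Nakazawa's formalism) of how a verbal complex passes its unsaturated complements up when an auxiliary combines with its verbal complement; the dictionary encoding and the helper name are assumptions made purely for illustration.

# Sketch of Hinrichs and Nakazawa-style verbal-complex formation: when an
# auxiliary combines with its base-form verbal complement, the complement is
# saturated and its own unsaturated NP complements are inherited ("raised")
# by the resulting verbal complex.
def form_verbal_complex(governed, aux):
    return {
        "PHON": governed["PHON"] + " " + aux["PHON"],   # e.g. "finden können"
        "VFORM": aux["VFORM"],
        # the complex still needs everything the governed verb needed:
        "COMPS": list(governed["COMPS"]),
    }

finden  = {"PHON": "finden",  "VFORM": "bse", "COMPS": ["NP[acc]"]}
koennen = {"PHON": "können",  "VFORM": "bse", "COMPS": []}
wird    = {"PHON": "wird",    "VFORM": "fin", "COMPS": []}

complex1 = form_verbal_complex(finden, koennen)   # "finden können", COMPS ['NP[acc]']
complex2 = form_verbal_complex(complex1, wird)    # "finden können wird"
print(complex2["PHON"], complex2["COMPS"])        # finden können wird ['NP[acc]']

The verbal complement sits last (most oblique) in the governing auxiliary's valence, which is why Hinrichs and Nakazawa expect it to be picked up first.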
Figure 13 Double infinitive with modal flip (Hinrichs and Nakazawa 1994: (11a)) [tree diagram for "Peter das Buch wird finden können"; not reproduced]
Figure 14 Double infinitive without modal flip (Hinrichs and Nakazawa 1994: (11b)) There are three kinds of data that Hinrichs and Nakazawa, using the tree structures in figures 13 and 14 and the lexical entry in (32), have left either unclarified or unaccounted for. These are the three cases of double infinitive which we list below: Case 1. The finite auxiliary may be an + auxiliary in V1 or V2 position. See figure 15. This is not a genuine problem for Hinrichs and Nakazawa’s analysis, since a lexical entry for an auxiliary with raising, such as is given in (32), will work for both + and − sentences. However, Hinrichs and Nakazawa state that they expect the finite auxiliary to pick up a verbal complement first among its complements, since the verbal complement is last on the list.
Figure 15 V1 sentence with double infinitive Since finite auxiliaries and verbal complexes are clearly discontinuous in the case of V2, this means that they will need a flat S structure in their grammar, such as our rule R2′ (28), in which the auxiliary combines with all of its complements at once.18 No other phrase structure would allow a + finite auxiliary to pick up the verbal complex first among its complements. Case 2. Sometimes, some NP complements appear between the double infinitive and the flipped auxiliary. Recall (14), repeated here as (33): (33) daß er ihnen hätte alles schicken sollen that he them would-have everything send should “that he should have sent them everything” Hinrichs and Nakazawa would not be able to admit sentence (33) in their grammar, since neither haben nor sollen may pick up a PVP complement which includes the NP alles, given the − constraint on the verbal complements of these auxiliaries. They do allow “flipping” over a VP or PVP in limited cases involving VP-complement-taking verbs such as sehen, hören, lassen, helfen. These verbs are marked in Hinrichs and Nakazawa (1994) with the feature +. In figure 16 the + verb helfen occurs with a PVP complement, in a sentence without modal flip (the auxiliary werden flips optionally). The figure models the relevant part of (34): (34) daß du uns die Schlacht gewinnen helfen wirst that you[nom] us[dat] the[acc] battle win[bse] help[bse] will[fin] “that you will help us win the battle” Alternatively, verbs such as helfen may, like the other auxiliaries in the Hinrichs and Nakazawa (1994) analysis, subcategorize for a verb or verbal complex that has not picked up any NP complements; these are marked − (figure 17). 18
Other alternatives include: a Head Movement account; the treatment of linearization based on order domains (Reape 1996).
Figure 16 Verb "helfen" with PVP [tree diagram for "die Schlacht gewinnen helfen wirst"; not reproduced]
Figure 17 Verb “helfen” with lexical We see from the contrasting phrase markers in figures 16 and 17 that spurious ambiguity arises in Hinrichs and Nakazawa (1994) when a VPcomplement-taking verb appears with an auxiliary, such as werden, which participates optionally in modal flip, and with argument inheritance by the auxiliary. As a general methodological principle, our analysis rules out this type of ambiguity. We also claim that it is still necessary to allow auxiliaries to flip over NP complements, given (33). Case 3. A base form infinitive, optionally with some NP complements, may be fronted away from a base form auxiliary. (35) shows a double infinitive and (36) shows how the double infinitive is “split.” (35) Wird Cecilia das Nilpferd füttern dürfen? Will Cecilia[nom] the hippo[acc] feed be-allowed-to “Will Cecilia be allowed to feed the hippo?” (36) Das Nilpferd füttern [wird Cecilia dürfen]S/VP . the hippo feed [will Cecilia be-allowed-to] “Cecilia will be allowed to feed the hippo.”
In (36), füttern forms a VP with das Nilpferd. füttern has not formed a double infinitive with dürfen. This is also a case of an + finite auxiliary. A fronted VP (36) shows that we do not want to require the base form auxiliary and the base form main verb to form a constituent. Otherwise, we will have to explain why the constituent sometimes appears split. We will successfully handle the three cases of double infinitive by restricting the subcategorization for auxiliary to be PVP only in cases such as (12) and (14) where the VP occurs after the final tensed verb, with raising of all complements of the head verb otherwise. Our specification for (36), a sentence with a base form VP fronted away from a base form auxiliary, will be introduced in section 5.
4.2
Revised account of modal flip: subcategorizing for PVP
4.2.1 LP rules for modal flip Linear precedence constraints, or LP rules, specify constraints on the relative order of sisters, including heads, subjects, complements, adjuncts, markers, and fillers. For the analysis for modal flip, we need to state the headcomplement ordering for verbs both in the verb-final position (−) and in the verb-initial position (+). And, principally, we must describe how the feature , which marks modal flip, affects word order. We will assume the following feature declarations for the sort verb. Following Hinrichs and Nakazawa (1994), we will use the head feature . Our feature will mark in the lexicon whether an auxiliary participates in modal flip.19 verb :vform verb: :boolean :boolean :boolean We will use vp as a subsort of synsem which has the following template: synsem : verb vp: : [bse ∨ prt] : [list(synsem)] : [list(synsem)] Note that VPs do not necessarily have unsaturated values; some verbs may not subcategorize for in the lexicon. 19
This is in contrast with Hinrichs and Nakazawa, who use the feature for verbs which trigger auxiliary flip, i.e., when the governing verb is flipped.
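Purely as an illustration of the feature geometry assumed here, the following sketch encodes the head features just mentioned; the names vform, aux, inv and flip are our reconstruction of the small-caps labels lost in typesetting and should not be read as the chapter's official inventory.

# Illustrative only: a possible encoding of the verb head-feature declarations.
from dataclasses import dataclass

@dataclass
class VerbHead:
    vform: str         # 'fin', 'bse', 'prt', ...
    aux: bool = False   # auxiliary verb
    inv: bool = False   # verb-initial / verb-second placement of the finite verb
    flip: bool = False  # lexically marked as participating in modal flip

werden_head = VerbHead(vform="fin", aux=True, flip=True)
print(werden_head)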
The three LP constraints given below facilitate our analysis. They describe an ordering between a head and a complement of the head, and assume the local tree as the domain of application. •
+ verbs, which are the subset of − verbs which participate in modal flip, precede their base form PVP complements, and follow everything else.20 + verbs include werden (“will”) and haben (“have”). : verb :+ :+
a verb :bse
[: [¬verb]] a : verb :+ :+ verb a :¬bse •
+ verbs precede their subjects and complements. (Refer also to figure 5.) : verb :+ :+
•
: verb :+ :+
a⊥
− verbs follow their subjects and complements. (Refer also to figure 6.) verb [: [¬verb]] a : :− :− :+
: verb :−
20
verb :− a : :+ :− :+
This rule may be extendible to infinitive form (zu-infinitive) VPs as well; this is a matter for future research.
•
The finite auxiliary follows base form auxiliaries. verb :− : :+ :− :¬fin
verb :− a : :+ :− :fin :+
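The following rough Python sketch shows one way the LP constraints above could be checked over sisters in surface order. Because the constraint boxes did not survive typesetting intact, the predicates below are a simplified reconstruction; the feature names and the dictionary encoding are assumptions for illustration only.

# Toy LP checker: each constraint is tested over every ordered pair of sisters.
def violates_lp(left, right):
    """True if placing `left` before `right` breaks one of the LP rules sketched above."""
    lv, rv = left.get("verb", False), right.get("verb", False)
    # 1. A flipped (flip+) auxiliary precedes a base-form verbal sister (its PVP) ...
    if right.get("flip") and lv and left.get("vform") == "bse":
        return True
    # ... and follows every non-verbal sister.
    if left.get("flip") and not rv:
        return True
    # 2. An inv+ (fronted finite) verb precedes subject and complements.
    if rv and right.get("inv"):
        return True
    # 3. A non-inverted, non-flipped verb follows its non-verbal sisters.
    if lv and not left.get("inv") and not left.get("flip") and not rv:
        return True
    # 4. Among verb-final auxiliaries, the finite one follows base-form ones.
    if lv and rv and left.get("vform") == "fin" and right.get("vform") == "bse" and not left.get("flip"):
        return True
    return False

def lp_ok(daughters):
    return not any(violates_lp(a, b)
                   for i, a in enumerate(daughters) for b in daughters[i + 1:])

er = {"verb": False}
ihnen = {"verb": False}
haette = {"verb": True, "vform": "fin", "flip": True}   # flipped auxiliary
pvp = {"verb": True, "vform": "bse"}                    # alles schicken sollen
print(lp_ok([er, ihnen, haette, pvp]))   # True:  daß er ihnen hätte [alles schicken sollen]
print(lp_ok([er, haette, ihnen, pvp]))   # False: an NP to the right of the flipped auxiliary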
4.2.2 Simplified auxiliary In this section we revise the lexical treatment of auxiliary introduced in section 4.1. First, we would like to simplify the lexical entry for auxiliary in such a way that there are no entries which give rise to spurious ambiguity. We will say that auxiliaries which do not participate in modal flip subcategorize for lexical heads (+ heads) and obligatorily raise all of the complement verb’s complements. This means that most auxiliaries will not subcategorize for PVP or VP. Our lexical entry for a − auxiliary is given in (37). Recall the competition between figures 2, 3, and 4 in section 2.5; the tree diagram admissible in our grammar is figure 2. (37) Revised lexical entry for auxiliary with a lexical head and raising of all complements: verb :bse : :+ :− :〈1〉 word :2 +
: verb :bse ∨ prt :2 :〈1NP〉 :+
Our representation of (34) is given in figure 18. In this figure, there are no (P)VP constituents in the sentence structure. The governing verbs werden and helfen raise all complements.21 21
We have not explored whether VP-complement-taking verbs like helfen should be required, like regular auxiliaries, to raise all of their complements. See also Kiss (1994).
Figure 18 Flat representation of double infinitive without modal flip [tree not reproduced]: daß du uns die Schlacht gewinnen helfen wirst, with wirst: [COMPS : 〈5,4,3,2,1〉], helfen: [COMPS : 〈4,3,2〉], gewinnen: [COMPS : 〈3〉]
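As a worked illustration of obligatory raising (again an informal sketch, not the chapter's formalism), the COMPS lists of figure 18 can be computed as follows; the numeric tags follow the figure.

# Each governing verb takes its verbal complement as a word-level complement
# and inherits all of that complement's own complements.
gewinnen = {"PHON": "gewinnen", "COMPS": ["3:NP[acc]"]}            # die Schlacht
helfen   = {"PHON": "helfen",
            "COMPS": ["4:NP[dat]"] + gewinnen["COMPS"] + ["2:gewinnen"]}
# wirst is finite, so by the "subjectless" rule its subject du also sits on
# COMPS, followed by everything inherited from helfen and helfen itself.
wirst    = {"PHON": "wirst",
            "COMPS": ["5:NP[nom]"] + helfen["COMPS"] + ["1:helfen"]}

print(gewinnen["COMPS"])  # ['3:NP[acc]']
print(helfen["COMPS"])    # ['4:NP[dat]', '3:NP[acc]', '2:gewinnen']
print(wirst["COMPS"])     # ['5:NP[nom]', '4:NP[dat]', '3:NP[acc]', '2:gewinnen', '1:helfen']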
Figure 19 V1 sentence with double infinitive: revised representation Second, we point out that our lexical entry for auxiliary in (37) works equally well for + and − auxiliaries. Our representation of an + sentence with a double infinitive is given in figure 19. Compare this with figure 15 (the analysis of Hinrichs and Nakazawa). We differ from Johnson (1986) and Hinrichs and Nakazawa (1989, 1994) in not making a constituent from the double infinitive, but leaving it undistinguished among the complements in the flat sentence structure. We are claiming that double infinitive is not a constituent in the case of V2. We use the evidence of a partial verb phrase fronting away from a governing auxiliary (38) to illustrate that a main verb should be free among the auxiliary’s complements. Therefore, we leave the main verb free also in (39). (38) Das Nilpferd füttern [wird Cecilia dürfen]. the hippo feed will Cecilia be-allowed-to “Cecilia will be allowed to feed the hippo.” (39) Das Nilpferd [wird Cecilia füttern dürfen]. the hippo will Cecilia feed be-allowed-to “Cecilia will be allowed to feed the hippo.’’
schicken: [COMPS : 〈5,4〉] sollen: [COMPS : 〈5,4,3〉] 1PVP: [COMPS : 〈5〉] hätte: [COMPS : 〈6,5,1〉]
Figure 20 + auxiliary with PVP (40) Lexical entry for modal flip auxiliary subcategorizing for PVP: verb : :+ :+ :〈1〉
:2 +
vp : verb :bse ∨ prt :+ :2 :〈1NP〉
Third, we still would like double infinitive to be a constituent for cases of modal flip, because the flipped auxiliary may not fall in the middle of a double infinitive (refer to (8)). We do this by requiring auxiliaries which flip to subcategorize for a (P)VP which has an auxiliary verb head. We give our lexical entry for a + auxiliary in (40). We want the PVP to the right of a flipped auxiliary to be a single constituent, with nothing to the right of the governing auxiliary raised out of the PVP. Otherwise, we will have spurious ambiguity. The analysis shown in figure 20 is the only analysis possible for (33) given the LP constraint that NP complements cannot appear to the right of a +, − auxiliary. The lexical entry for haben follows the template in (40) for a + auxiliary. haben subcategorizes for its subject er and a PVP headed by sollen, and the indirect object ihnen raised by sollen. The inflected form hätte is instantiated with the right hand side of the “subjectless” lexical rule in (27)
so that its grammatical subject is on its list. sollen in turn is a − auxiliary which follows the template in (37), and subcategorizes for a single lexical verb schicken, its direct object alles and indirect object alles. Any phrase on the list of the lower verb that is not realized as a complement daughter can appear on the list of the mother. Therefore, the following sentences are also predicted to be grammatical given the auxiliary specifications in (37) and (40): (41) a. daß er ihnen alles hätte schicken sollen b. ??daß er alles hätte ihnen schicken sollen c. ??daß er hätte ihnen alles schicken sollen Only (41a) is clearly acceptable among German speakers. A general concern is that there are varying levels of acceptability among speakers for PVPs in position after a finite auxiliary. See Hinrichs and Nakazawa (1994) for grammaticality judgments on finding NPs between the flipped auxiliary and the double infinitive. The subcategorization of a + auxiliary may be one place where the restrictions on PVP complements can be specified. Furthermore, the acceptability of PVPs in the fronted position also varies. There may be a relation between the fronted PVPs which are acceptable for a given speaker and the PVPs which the same speaker can “flip” over. This is a topic for more research. One could examine which constraints on PVPs hold in the two contexts.22
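The bookkeeping behind figure 20 can be sketched in the same informal style; the tags follow the figure, and the encoding is an illustrative assumption rather than the authors' notation.

# "daß er ihnen hätte alles schicken sollen" under the flip analysis.
schicken = {"PHON": "schicken", "COMPS": ["5:NP[dat]", "4:NP[acc]"]}
# sollen is a non-flipping auxiliary: it takes the lexical verb and raises its complements.
sollen = {"PHON": "sollen", "COMPS": schicken["COMPS"] + ["3:schicken"]}

# The PVP "alles schicken sollen" realizes 4 (alles) and 3 (schicken) as daughters;
# whatever is left over stays unsaturated on the PVP's COMPS list.
realized = {"4:NP[acc]", "3:schicken"}
pvp = {"PHON": "alles schicken sollen",
       "COMPS": [c for c in sollen["COMPS"] if c not in realized]}

# hätte is a flipped auxiliary: subject er (via the subjectless rule), the
# complement still unsaturated on the PVP, and the PVP itself.
haette = {"PHON": "hätte", "COMPS": ["6:NP[nom]"] + pvp["COMPS"] + ["1:PVP"]}

print(pvp["COMPS"])     # ['5:NP[dat]']  (= ihnen, raised out of the PVP)
print(haette["COMPS"])  # ['6:NP[nom]', '5:NP[dat]', '1:PVP']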
4.3
Remaining problems for the modal flip analysis
We have a few problems to note. The first is that the PVP which a + auxiliary subcategorizes for must not only be headed by an auxiliary, but also must have picked up the verb it governs. This is not guaranteed by the lexical entry in (40). (42) *. . . daß er bestehen wird [das Examen können]PVP that he pass will the exam be-able-to “. . . that he will be able to pass the exam” It turns out that we do not want the PVP in (42) to appear in a fronted context, either. We will suggest some solutions to this problem after we have presented our account of PVP fronting. The second problem is that we may want subjects in extraposed VPs, at least when the subjects are quantified; recall (15), repeated here as (43): 22
In contrast to this analysis, Nerbonne (1994) leaves a finite flipped auxiliary free in position in S, subject to LP constraints, with a complex formed for the double infinitive. He claims that his is an equally plausible hypothesis to explain the appearance of an NP complement to the right of an − auxiliary (as in 33). We feel that extraposition of PVP is a better explanation of the facts because it leaves open a link between the constituency of what can be fronted and that of what can be “flipped over.” Although, our LP rules cannot relate the constituent daughters of an extraposed PVP and the constituent daughters of the matrix S, which could be problematic.
(43) daß gestern hätte keiner kommen dürfen that yesterday would-have nobody[nom] come be-allowed-to “that nobody would have been allowed to come yesterday” Our grammar fragment, as it stands, doesn’t have an ID rule for combining head and subject into a phrase. It could be that kommen (“come’’) is actually an unaccusative verb in (43), which would tie in rather nicely with our proposed handling of subjects in fronted PVPs (to be discussed in section 5.4). In the meantime, we need to explain how it is that PVPs appear at all in the fronted position, since we have claimed that regular auxiliaries subcategorize not for a PVP but for a lexical head. This is our task in the next section.
5
Partial verb phrase fronting
As we built on Hinrichs and Nakazawa’s analyses in the previous section, so will we work from the analysis of Nerbonne (1994) for partial verb phrase fronting (sections 5.1 and 5.2). This leads to formulating constraints on PVP in section 5.3, which entails a further revision of the lexical entry for auxiliary. In section 5.4 we account for the data for verbs with nonagentive subjects, introducing a “subjectless” lexical entry for these verbs.
5.1
Nerbonne’s account of PVP fronting
Nerbonne (1994) proposes a rule which operates on auxiliary verbs and passes the verbal head of a VP complement of the auxiliary, plus some number of the head’s complements, into . The rule is copied as (44) below. (44) Nerbonne’s /-PVP Lexical Rule (Nerbonne 1994): synsem local category : : : verb :+ :1{ . . . , 2 [vp-bse], . . .} synsem local category : : : verb :+ :(1 ø 3) – 2 nonlocal : : bse-verb-ss :3 Condition: for 2 a modal-V-sign, ¬∃V [ 3
→
Rule (44) assumes that, given an auxiliary entry which subcategorizes for VP, there also exists a lexical entry for the same auxiliary where there is a PVP in . The PVP subcategorizes for some subset of the complements of the head of the VP. Any unsaturated complements of the PVP are raised to become complements of the auxiliary verb in the matrix S. The condition on the rule prohibits an auxiliary verb from appearing in in case it governs a verb in . With his lexical rule Nerbonne gives two phrase structure rules, shown in (45) (H stands for head, and C*, for any number of complements). Nerbonne’s first rule is for a flat sentence, with subject and complements satisfied. His second rule is like our R2, and allows a partial VP, VP or S. However, this phrase has to be +, meaning that it occurs in (at least) fronted position. (45) Nerbonne’s PS rules: a. synsem local category : : :{ } :{ } b.
→ H,C*
synsem :
local content : :+
→ H,C*
There is no constraint on the left side of rule (45b) on the or values. Nerbonne’s auxiliaries can still alternatively subcategorize for either a VP, or a V plus the V’s complements. It happens that all (P)VPs must be +, and so a VP can actually only be found in fronted position (in , not ).
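Abstracting away from the feature-structure detail, the combinatorics of Nerbonne's rule can be pictured as enumerating ways of splitting the governed verb's complements between the fronted partial VP and the matrix clause. The following sketch is only a paraphrase of that idea under invented names; it is not rule (44) itself.

# Enumerate the (fronted-PVP, raised-to-matrix) splits the rule makes available,
# given the complement list of the VP's head verb.
from itertools import combinations

def pvp_variants(verb_comps):
    out = []
    for k in range(len(verb_comps) + 1):
        for taken in combinations(verb_comps, k):
            raised = [c for c in verb_comps if c not in taken]
            out.append({"SLASH_PVP": list(taken), "RAISED": raised})
    return out

# erzählen takes an accusative and a dative object:
for v in pvp_variants(["NP[acc]", "NP[dat]"]):
    print(v)
# e.g. {'SLASH_PVP': ['NP[acc]'], 'RAISED': ['NP[dat]']} corresponds to a
# fronting like "Ein Märchen erzählen kann er seiner Tochter."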
5.2
Revising Nerbonne’s PVP rule: auxiliaries subcategorize for lexical heads
In order to use Nerbonne’s PVP fronting rule (44) with our rule R2′ (28), and with our obligatory raising of all V complements, we must revise the PVP rule. We will change the input to the rule so that it matches our lexical entry, i.e. so that the input auxiliary subcategorizes for a verb that is +. We show the rule as (46). We will never have VPs in a matrix S in because − auxiliaries subcategorize for lexical heads. Rule (46) says that VPs must appear in . But, we allow PVPs to exist by our Schema R2′.
(46) Revised /-PVP Lexical Rule (first version): synsem local category verb :1 :+ : category : verb :2 + 3 + :4 :bse ∨ prt :+ :3 synsem local category : : :1 :2 + 5 nonlocal vp : : :4 :5
→
There must be no satisfied subject in the input to the rule, since the verb is +, nor in the output of the rule, since we assume that our lexical rule makes explicit all changes to the input, leaving all other features of the input unchanged. Our PS rule R2′ for VPs and PVPs remains general – we have no “+” restriction in our PVP rule, as appears on the left-hand side of Nerbonne’s PS rule (45b). We don’t think there should be such a restriction in the ID schemata, as we believe that PS rules should define the constituent structures in a language without constraining where they occur. Generality in PS rules enables cross-language comparisons. Also, we point out that we have only one PS rule in the grammar compared with his two, which makes things simpler. A tree diagram for an application of our revised PVP rule (46) is in figure 21. Figure 21 shows the tree structure and subcategorization for (47). (47) Ein Märchen erzählen [kann er seiner a fairy-tale[acc] tell[bse] [can[fin] he[nom] to-his[dat] Tochter.]S/PVP daughter] “He can tell his daughter a fairy-tale.” The lexical rule for the “subjectless” analysis, (27), must also apply to können, moving its onto , before the PVP rule applies. Note that rule (46) works also for cases of double infinitive fronting, since PVPs include double infinitive PVPs:
category vp ein Märchen :5 :5 verb erzählen: :bse erzählen: 4 :〈1〉 :〈1NP [nom]〉 :〈3〉 :〈2NP [acc],3NP[dat]〉 synsem local verb : : :fin kann: :+ :〈1,3〉 nonlocal :
:{4} Figure 21
S analysis for PVP fronting
(48) Füttern dürfen wird Cecilia das Nilpferd. feed be-allowed-to will Cecilia the hippo “Cecilia will be allowed to feed the hippo.” A tree diagram for (48), also showing subcategorization for the verbs, is given in figure 22.
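To make the division of labour in (47) concrete, here is an informal sketch of the split that rule (46) licenses, with tags as in figure 21; the encoding is assumed for illustration and is not the chapter's notation.

erzaehlen = {"PHON": "erzählen", "SUBJ": ["1:NP[nom]"],
             "COMPS": ["2:NP[acc]", "3:NP[dat]"]}

# The fronted PVP "ein Märchen erzählen" realizes the accusative object;
# the dative object is left unsaturated on the PVP's COMPS list.
pvp = {"PHON": "ein Märchen erzählen",
       "COMPS": [c for c in erzaehlen["COMPS"] if c != "2:NP[acc]"]}

# kann: the subjectless rule has put the subject on COMPS; the PVP's leftover
# complement is raised; the PVP itself is not a complement but sits in SLASH.
kann = {"PHON": "kann",
        "COMPS": ["1:NP[nom]"] + pvp["COMPS"],   # <1, 3>
        "SLASH": {"4:PVP"}}

print(pvp["COMPS"], kann["COMPS"], kann["SLASH"])
# ['3:NP[dat]'] ['1:NP[nom]', '3:NP[dat]'] {'4:PVP'}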
5.3
Constraints on PVP
The PVP in (42), a PVP which has an auxiliary but no verb, is ungrammatical in a fronted context. (49) *Das Examen können [wird er bestehen]S/PVP . the exam be-able-to will he pass “He will be able to pass the exam.” As we noted above in section 5.1, Nerbonne adds a condition into his rule for PVP fronting to prevent the appearance of auxiliary in just in case it governs a verb in . Likewise, we don’t want our PVP rule, which makes use of sequence union, to unite a base form auxiliary and a noun in
category vp :3 verb :4 füttern dürfen: 5 füttern: :bse :〈1〉 :〈1NP [nom]〉 :〈2〉 :〈2NP [acc]〉 category verb :4 :bse dürfen: :+ :〈1〉 :〈2,3〉
synsem local verb : : :fin wird: :+ :〈1,2〉 nonlocal :
:{5}
Figure 22 S analysis for PVP fronting of double infinitive a phrase, while skipping over a verb. We could adopt a condition similar to Nerbonne’s into the output of our PVP fronting rule (46): Condition: ¬∃V [ 2 The same condition would hold for the subcategorization requirements of the auxiliary in (40). The condition, however, appears to be on PVPs generally, rather than on the lexical entries looking for them. As an anonymous reviewer suggests, another way that the constraint on PVP might be expressed is more in line with recent work (Sag 1997) which makes greater use than previous versions of HPSG of types for phrases, and takes advantage of constraints on those types. This would be to view subcategorized pvp as a subtype of vp. We could constrain the value of pvp via a type constraint. This solution more clearly demonstrates that the same type of PVP may be involved in both fronting and “flip” constructions; in this case the constraint is on the PVP itself, rather than in the coincidental subcategorization requirements of the PVP lexical rule and of the + auxiliary lexical entry. In this solution, we have the following template for pvp, a proposed subsort of synsem and vp:
synsem : verb pup: :[bse ∨ prt] :[list(synsem)] :1 Condition: ¬∃V [ 1 Thus, our final revised versions of the PVP-fronting lexical rule and the modal flip auxiliary are given as follows, in (50) and (51): (50) /-PVP Lexical Rule (final version): synsem local category verb : :+ : category : verb :1 + 2 + : :bse ∨ prt :+ :2
synsem local category : : : verb :+ :1 + 2 nonlocal : : pvp :2
→
(51) Lexical entry for modal flip auxiliary subcategorizing for PVP (final version): verb : :+ :+ :〈1〉 :2 +
pvp :2 :〈1np〉
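Informally, the condition built into (50) and (51) can be pictured as a well-formedness check on pvp objects, for example as in the following sketch; the encoding and the helper name are assumptions for illustration, not part of the proposal.

def is_valid_pvp(pvp):
    """A PVP must have realized the verb it governs: nothing verbal may be
    left unsaturated on its COMPS list."""
    return not any(c.get("verb") for c in pvp["COMPS"])

ok_pvp  = {"PHON": "ein Märchen erzählen",
           "COMPS": [{"verb": False, "tag": "NP[dat]"}]}        # cf. (47)
bad_pvp = {"PHON": "das Examen können",                         # cf. (42)/(49)
           "COMPS": [{"verb": False, "tag": "NP[nom]"},
                     {"verb": True,  "tag": "V[bse] bestehen"}]}
print(is_valid_pvp(ok_pvp), is_valid_pvp(bad_pvp))              # True False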
An interesting question for future investigation is whether the condition prohibiting a verbal element on the list of a PVP is the best explanation for the data in (42) and (49). There are at least two possible alternatives:
•	One could separate the verbal complements from the other complements, by further splitting the valence lists into and . One would then require to be saturated for . would be similar to the feature proposed by Chung (1993).
•	As an anonymous reviewer suggests, there may perhaps be a more general constraint that whenever a head has a complement from which arguments are attracted, then that complement must actually be realized as a complement daughter with that head.
5.4
Verbs with nonagentive subjects
A remaining problem is that, by using schema R2′ for fronted (P)VPs, we still have not allowed for the fronting of nonagentive subjects of verbs with their governing heads. Nerbonne (1994) notes fronted subjects as an unresolved problem for his analysis. We find this problem to be resolvable. The lexical structure which we present in this section has been motivated by the underlying structure for unaccusatives given in Perlmutter (1978), originally in the framework of Relational Grammar. This is that the grammatical subjects of unaccusative verbs are underlyingly objects. We suggest that for verbs which have nonagentive subjects, their nominative arguments are underlyingly complements on the list rather than subjects on the list. (52) Lexical specification for verbs with nonagentive subjects: category verb : :bse :〈 〉 np : ,... :nom The AVM in (52) is in the spirit of the lexical rule for the subjectless analysis of matrix clauses (27), but the verb is a base form, not a finite form. Verifying that (52) is the proper lexical skeleton for some set of German verbs is a task for future research. For example, the nonagentiveness of the subject could be indicated via semantic features. Example (52) is in line with the account of the German passive in Pollard (1994). Pollard describes a constraint on the subcategorization of the passive auxiliary, such that there must be a referential subject (i.e. a nondummy subject) in the list for the auxiliary’s main verb complement. In (53) we have the lexical entry for the German passive auxiliary werden.
(53) Passive auxiliary werden (Pollard (1994: 40), simplified): category verb : :bse :2 category verb :3 + : :part :〈NP[str]ref 〉 :2 + 3 Given the description of the German passive auxiliary in (53), it would then follow from (52) that verbs such as unterlaufen (occur) cannot undergo passivization. Recall (22), repeated here as (54): (54) *Ihm wurde von einem wirklichen Fehler noch nie unterlaufen. to-him has by a real mistake still never occurred Our revision of Nerbonne’s PVP rule (50) can still apply even when the feature of VP is an empty list. In this case, the subject will be the first complement on the list structure shared by the raising auxiliary and the complement verb. We repeat (20) here as (55). A tree diagram and feature structures for (55) are given in figure 23. (55) Ein wirklicher Fehler unterlaufen [war ihm noch a real mistake[nom] occur [was to-him[dat] still nie]. never] “He never made a real mistake.” (Uszkoreit 1987a: (14a))
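A sketch of the effect of (52), matching the feature structures in figure 23; again the encoding is illustrative only and not the chapter's formalism.

# unterlaufen lists its nominative argument among the COMPS rather than on
# SUBJ, so the PVP rule can bundle it with the verb and front the result.
unterlaufen = {"PHON": "unterlaufen", "SUBJ": [],
               "COMPS": ["1:NP[nom]", "2:NP[dat]"]}

# Fronted PVP "ein wirklicher Fehler unterlaufen": the nominative is realized
# inside the PVP, the dative is left over and raised to the matrix auxiliary.
pvp = {"PHON": "ein wirklicher Fehler unterlaufen",
       "COMPS": [c for c in unterlaufen["COMPS"] if c != "1:NP[nom]"]}
war = {"PHON": "war", "COMPS": pvp["COMPS"], "SLASH": {"3:PVP"}}

print(pvp["COMPS"], war["COMPS"])   # ['2:NP[dat]'] ['2:NP[dat]']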
6
Conclusion
We have shown an account of modal flip and partial verb phrase (PVP) fronting in which we have required German auxiliaries to raise all of a verb’s complements. We have been able to account for the same data for modal flip and PVP fronting accounted for in Hinrichs and Nakazawa (1989), Hinrichs and Nakazawa (1994), and Nerbonne (1994). We summarize the main points of the paper: •
We have shown that the only places that PVPs occur in German are extraposed after an − auxiliary or fronted before an + auxiliary. We have used a single, flat, head-complement schema for German sentence and (P)VP. Our phrase structure rule for PVP is more general than Nerbonne’s because it is not restricted to + contexts.
vp :4 ein wirklicher Fehler unterlaufen: 3 :〈 〉 :〈2〉 category : verb unterlaufen: 4 :bse :〈 〉 :〈1NP [nom],2NP [dat]〉 synsem local verb : : :fin war: :+ :〈2〉 nonlocal :
:{3} Figure 23 S analysis for subject fronting with PVP Rule
•	Rather than forming a complex from the governing auxiliary together with the double infinitive, we limit double infinitive to cases of modal flip. Data supports the premise that + auxiliaries take PVP complements while − auxiliaries do not. − auxiliaries subcategorize for lexical heads. We have accounted for a wider range of modal flip sentences than allowed by Hinrichs and Nakazawa.
•	We allow PVP fronting of a subject to occur only when the subject is listed on the list of a verb in the lexicon.
We have laid some groundwork for future research. The points in the paper which merit some additional investigation are these: •
The acceptability of PVPs in modal flip contexts or in fronted position varies across speakers. PVPs may also be subject to pragmatic constraints. We would like to explore whether the PVPs in fronted position are
subject to the same constraints as PVPs which have been extraposed. Particularly, we would like to explore the constraints on the appearance of subjects in extraposed PVPs.
•	PVPs do not exist without the presence of the governed verb. There is more than one way to exhibit this constraint in HPSG, and the exact nature of the constraint is worth further exploration.
•	The relation between unaccusative verbs and the set of verbs with nonagentive subjects, and the underlying structures for these in HPSG, is a topic for future research. Using a version of HPSG with a distinguished feature for subject, we have proposed for the first time that the subjects of these verbs may be listed with nonsubject complements.
•	We would like to expand the grammar for auxiliaries which take infinitive form (P)VPs, or zu-infinitive (P)VPs, which may also be extraposed.
•	The claim that PVPs do not exist in matrix clauses could be tested against data from other parts of the grammar besides the syntax, such as facts from phonology.
References Bech, Gunnar. 1955. Studien über das deutsche Verbum infinitum. KDVS35,2. Belletti, Adriana, and Luigi Rizzi. 1988. Psych-verbs and Θ-theory. Natural Language and Linguistic Theory 6: 291–352. Bierwisch, Manfred. 1963. Grammatik des deutschen Verbs. Berlin: Akademie Verlag. Borsley, Robert D. 1987. Subjects and complements in HPSG. Technical Report CSLI-107-187. Stanford: Center for the Study of Language and Information. Borsley, Robert D. 1989. Phrase-structure grammar and the barriers conception of clause structure. Linguistics 27: 843– 863. Chung, Chan. 1993. Korean auxiliary verb constructions without VP-nodes. In Proceedings of the 1993 Workshop on Korean Linguistics, ed. Susumo Kuno et al. Harvard Studies in Korean Linguistics, vol. 5. 274 –286. den Besten, H. 1983. On the interaction of root transformations and lexical deletive rules. In On the Formal Syntax of the Westgermania. Papers from the Third Groningen Grammar Talks, Groningen, January 1981, ed. W. Abraham. Amsterdam and Philadelphia: Benjamins. 47–138. den Besten, H., and Jerold A. Edmondson. 1983. The verbal complex in Continental West Germanic. In On the Formal Syntax of the Westgermania. Papers from the Third Groningen Grammar Talks, Groningen, January 1981, ed. W. Abraham. Amsterdam and Philadelphia: Benjamins. den Besten, Hans, and Gert Webelhuth. 1990. Stranding. In Grewendorf and Sternefeld (eds.). 77– 92. Dowty, David. 1991. Thematic proto-roles and argument selection. Language 67(3): 547– 619. Drach, Erich. 1963. Grundgedanken der deutschen Satzlehre. Wissenschaftliche Buchgesellschaft. Gazdar, Gerald. 1981. Unbounded dependencies and coordinate structure. Linguistic Inquiry 12: 155–184.
Grewendorf, Günther, and Wolfgang Sternefeld (ed.). 1990. Scrambling and Barriers. Amsterdam and Philadelphia: Benjamins. Haider, Hubert. 1990. Topicalization and other puzzles of German syntax. In Grewendorf and Sternefeld (eds.). 43–112. Heidolph, K. E., W. Faemig, and W. Motsch. 1981. Grundzüge einer deutschen Grammatik. Berlin: Akademie Verlag. Hinrichs, Erhard W., and Tsuneko Nakazawa. 1989. Flipped out: AUX in German. In Papers from the 25th Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society. Hinrichs, Erhard W., and Tsuneko Nakazawa. 1994. Linearizing AUXs in German verbal complexes. In Nerbonne, Netter, and Pollard (eds.). 11–37. Johnson, Mark. 1986. A GPSG account of VP Structure in German. Linguistics 24: 871– 882. Kathol, Andreas. 1992. Unaccusative mismatches in German. In Proceedings of the Second Formal Linguistics Society of Mid-America Conference (1991), ed. Matt Alexander and Monika Dressler. University of Michigan and University of Wisconsin, Madison. 74 – 88. Kiss, Tibor. 1994. Obligatory coherence: an investigation into the syntax of modal and semi-modal verbs in German. In Nerbonne, Netter, and Pollard (eds.). 71–107. Kroch, Anthony S., and Beatrice Santorini. 1991. The derived constituent structure of the West Germanic verb-raising construction. In Principles and Parameters in Comparative Grammar, ed. Robert Freidin. Cambridge, MA: MIT Press. 269 –338. Levin, Beth, and Malka Rappaport Hovav. 1995. Unaccusativity: At the Syntax-Lexical Semantics Interface. Linguistic Inquiry Monograph 26. Cambridge, MA: MIT Press. Nerbonne, John. 1986. “Phantoms” and German fronting: poltergeist constituents? Linguistics 24: 857– 870. Nerbonne, John. 1994. Partial verb phrases and spurious ambiguities. In Nerbonne, Netter, and Pollard (eds.). 109 –150. Nerbonne, John, Klaus Netter, and Carl Pollard (eds.). 1994. German Grammar in HPSG. Stanford: Center for the Study of Language and Information. Perlmutter, David. 1978. Impersonal passives and the unaccusative hypothesis. In Papers from BLS4. 157–189. Pollard, Carl. 1994. Toward a unified account of German passive. In Nerbonne, Netter, and Pollard (eds.). 273–296. Pollard, Carl J. 1996. On head nonmovement. In Discontinuous Constituency, ed. Harry Bunt and Arthur van Horck. Berlin: Mouton de Gruyter. 279–305. Pollard, Carl, and Ivan Sag. 1987. Information-Based Syntax and Semantics, vol. 1: Fundamentals. CSLI Lecture Notes no. 13. Stanford: Center for the Study of Language and Information. Pollard, Carl, and Ivan Sag. 1992. Anaphors in English and the scope of binding theory. Linguistic Inquiry 23: 261–303. Pollard, Carl, and Ivan Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information. Reape, Mike. 1996. Getting things in order. In Discontinuous Constituency, ed. Harry Bunt and Arthur van Horck. Berlin: Mouton de Gruyter. 209 –253. Ross, John Robert. 1967. Constraints on Variables in Syntax. Ph.D. dissertation, Massachusetts Institute of Technology. Published as Infinite Syntax!. 1986. Norwood, NJ: Ablex.
Sag, Ivan A. 1997. English relative clause constructions. Journal of Linguistics 33(2): 431– 484. Uszkoreit, Hans. 1987a. Linear precedence and discontinuous constituents: complex fronting in German. In Discontinuous Constituents, ed. G. J. Huck and A. E. Ojeda. Syntax and Semantics, vol. 20. London: Academic Press. 406– 427. Uszkoreit, Hans. 1987b. Word Order and Constituent Structure in German. CSLI Lecture Notes no. 8. Stanford: Center for the Study of Language and Information. von Stechow, Arnim, and Wolfgang Sternefeld. 1988. Bausteine syntaktischen Wissens. Westdeutscher Verlag. Zaenen, Annie. 1988. Unaccusativity in Dutch: An integrated approach. Unpublished manuscript.
5. A lexical comment on a syntactic topic1
Kazuhiko Fukushima
Kansai Gaidai University
1
Introduction
Topicalization2 in Japanese – a meeting point of syntax/semantics and pragmatics – is one of the most talked about subjects in Japanese linguistics3 which has been given few explicit formal accounts (for a few exceptions, see Shirai's (1986) formal semantic account and Gunji's (1987) syntactic approach). The phenomena in question consisting of two subtypes are exemplified in (1b, c) (cf. a nontopic counterpart (1a)).4
1 The work reported here grew out of a small conceptual kernel formed during the 1993 Head-Driven Phrase Structure Grammar Workshop held at the Ohio State University. I thank Takao Gunji and Hidetoshi Shirai for getting me started with the basic ideas. Thanks also go to two referees for their stimulating comments and criticisms and to Bob Levine for his conscientious editorial assistance. I would like to express my sincere appreciation to Ivan Sag, ΦΣΓ's ringleader, whose generous support made it possible for me to take part in the workshop. I am solely responsible for any remaining errors.
2 Though it is certainly relevant, the "contrastive" aspect of the marker -wa is not accounted for in this paper.
3 Among extensive past work, we find pioneering syntactic/semantic studies such as Mikami (1960), Kuroda (1965), and Kuno (1973), to name just a few. More recently, we find some syntactic research in the GB framework (e.g. Saito 1985, 1986, Tateishi 1994). The GB treatments, however, seem to lack unity and are inconclusive due to the fact that the construction as a whole does not neatly fit into the way the framework divides up natural language syntactic phenomena (more on this below). The phenomena have also been investigated from a pragmatic/discourse point of view as many papers in Hinds et al. (1987) demonstrate.
4 For an example like (1c) to be fully plausible, a proper context is needed in which Hanako, the topic, possesses a pragmatically detectable "aboutness" relationship (see note 6 below) to the rest of the sentence, for example, Taroo is a significant person for Hanako. A similar remark applies to other examples to follow.
(1)
a. Taroo -ga kaetteki-ta. - come.back- “Taroo returned.” b. Taroo -wa kaetteki-ta. - come.back- “As for Taroo, he returned.” c. Hanako -wa Taroo -ga kaetteki-ta. - - come.back- “As for Hanako, Taroo returned.”
The following two points are the main contentions of this paper. First, an explicit lexical account for topicalization is readily available within the framework of HPSG (Pollard and Sag (P&S hereafter) 1987, 1994). Further such an account is shown to be well motivated due to its ability to unite – under a set of common assumptions – the data that have been observed by previous researchers but treated as unrelated phenomena, and explain facts that have been hitherto unaccounted for (e.g. PP-topicalization introduced in section 2.3). Second, taking topicalization to be a lexically oriented phenomenon sheds light on the nature of constraints on the applications of lexical rules which are increasingly important theoretical apparatuses of HPSG but have been left generally unconstrained.
2
A lexical account of topicalization
2.1
Topicalization lexical rules
Following what is commonly accepted in the literature, topicalization is divided into two classes: the topic “substitution” type in (1b) and the topic “addition” type in (1c). Two topicalization lexical rules seen in (2a, b) are proposed in order to account for the former and latter types of topicalization, respectively.5 (2)
5
a.
x
verb + - 〈 . . . XP1 . . . 〉 y ... 1 ...
⇒
I use “- list” here as an equivalent of “ list” seen in Gunji (1987). I do not include and lists in the following AVMs. In general, the concatenation of these two lists equals an - list.
x
verb + - 〈 . . . XPwa 1. . . 〉
b.
y ... 2 1 ... about | 1 2 ⇒ x verb + - 〈 . . . 〉 y ... ... x
verb + - 〈ΝPwa 1, . . . 〉 y ... [2] ... about | 1 2
In (2a) is the topic substitution rule which augments one of the - elements XP of the input verb with the topic marker -wa.6 It also adds a new pragmatic relation about to its . The XPwa now assumes double roles, namely being not only the original semantic argument of the verb but also the topic for which the psoa is a comment. No such augmentation of a marker takes place when the topic addition rule (2b) applies where a new - element NPwa is added. A similar change takes place in which specifies that the newly added topic element stands in about relation to the psoa in . 6
It is assumed that this augmentation is accompanied by a morphological process that removes the case markers -ga and -o obligatorily and -ni optionally. The morphological process keeps any other case markers/postpositions intact.
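For concreteness, the two rules can be mimicked procedurally as below; the list-of-strings encoding of ARG-S, and the CONTENT/BACKGROUND labels, are toy stand-ins for the AVMs in (2), not the author's formalism.

# Toy versions of the topic substitution rule (2a) and topic addition rule (2b).
def topic_substitution(entry, i):
    """Replace the case marker of the i-th ARG-S element with -wa and record
    an 'about' relation between that argument and the verb's content."""
    out = {**entry, "ARG-S": list(entry["ARG-S"]),
           "BACKGROUND": list(entry.get("BACKGROUND", []))}
    topic = out["ARG-S"][i].split("-")[0] + "-wa"
    out["ARG-S"][i] = topic
    out["BACKGROUND"].append(("about", topic, entry["CONTENT"]))
    return out

def topic_addition(entry):
    """Add a fresh NP-wa to the front of ARG-S; it bears no semantic role of
    the verb but stands in the 'about' relation to the verb's content."""
    return {**entry, "ARG-S": ["NP-wa"] + list(entry["ARG-S"]),
            "BACKGROUND": list(entry.get("BACKGROUND", []))
                          + [("about", "NP-wa", entry["CONTENT"])]}

kaetteku = {"PHON": "kaetteku", "ARG-S": ["NP-ga"], "CONTENT": "return(1)"}
print(topic_substitution(kaetteku, 0)["ARG-S"])   # ['NP-wa']            (cf. (3b))
print(topic_addition(kaetteku)["ARG-S"])          # ['NP-wa', 'NP-ga']   (cf. (3c))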
Let us see some examples. These rules operate on a verb like kaetteku “return” in (3a) and give rise to corresponding “topicalized” verbs in (3b, c) which are responsible for (1b, c) above, respectively. In (3b), the marker of NPga in the original - list is replaced with the topic marker -wa and in (3c) a topic NPwa is added to the list as an extra element. Both rules augment the original entry with the new pragmatic information (in |) which indicates that the contextual relation about holds between the topic and a comment (psoa), capturing the “aboutness” nature of the phenomenon.7 (3)
7
a.
kaetteku
b.
kaetteku
c.
kaetteku
verb + - 〈NPga 1〉 return 1 verb + - 〈NPwa 1〉 return 2 1 about | 1 2 verb + - 〈NPwa 3, NPga1〉 return 2 1 about | 3 2
The exact pragmatic properties of “aboutness” are left open for this paper. To this end, a formal approach along the lines of Yoon (1994) seems to be promising in capturing some aspects of the pragmatic properties in question. According to Yoon, aboutness is a lexicopragmatically determined relation called R- holding between an event and an individual. Typical R-s are seen in (ia) below. (Caveat: these are not thematic roles but what Yoon calls “familiar relations” which seems to be a superset of the set of thematic roles.) An NP with an aboutness relative clause in Korean in (ib) will be given an analysis like (ic). (i)
a. R- {, , , , , , , , , , , , , }
2.2
Consequences: long-distance topicalization
As easily noted, one of the consequences of this analysis is that either the substituted or added topic becomes an argument included in the - list, predicting its syntactic behavior (for example extractability) to be on a par with other arguments. (For more discussion on this point, especially with respect to the topic addition rule, see section 2.4.) This turns out to be true with respect to long-distance topicalization.8 First, we consider the topic substitution type. (4a) shows that an embedded subject can be scrambled across an S boundary if the topic marker replaces the nominative marker. As has been noted in the literature (Saito 1985, Fukushima 1991, etc.), a nominative subject cannot be scrambled out of a minimal clause due to an independent factor. (The location of the “gap” is indicated by “e” for convenience.) (4)
a. Hanako 1-wa/*-ga uwasa -ga [e1 zibun-no kokyoo-de -/- rumor - self- hometown-in sinda] -to tutaetei-ru. died - report- “As for Hanako, the rumor reports [that she died in her hometown].” Taroo -ga kyoo-made [Ziroo -ga b. Ano hon1-wa/-o that book-/- - today-till - honya-de e1 kau] -to omottei-ta. bookstore-at buy - think- “As for that book, Taroo thought till today [that Ziroo would buy it at a bookstore].”
Long-distance scrambling of a direct object is possible when it is marked either with the topic or accusative marker as seen in (4b). (Indirect objects show the same pattern as direct objects.) Next let us examine the topic addition type. We recall that in this type of topicalization, the topic does not correspond to any semantic argument of the verb, making it necessary for the topic to be understood to have some pragmatic “aboutness” relationship to other elements in the sentence. It is interesting to note that, if suffixed with the topic marker, even adjuncts can be b. ai-ka wun UN soli baby- cry sound “(Lit.) the sound that a baby cries = the sound that characterizes a baby’s crying” c. λx[sound ′(x) ∧ ∃e∃y[[crying ′(e) ∧ (e, y) ∧ baby ′( y)] ∧ R-(e, x)]]
8
The final line indicates that there is a pragmatically determined R- (in this case ) that holds between a baby’s crying event e and an individual x, namely sound. The whole expression is a one-place predicate, i.e. a common noun. For terminological convenience, I sometimes refer to long-distance topicalization as longdistance “scrambling” but this should not be taken to mean that I consider the two identical. See Saito (1986) for the discussion of the differences between scrambling and topicalization. There is no pretense at all that the account presented here is intended to be complete for scrambling phenomena in general.
scrambled out of a minimal S as shown in (5a). The S-initial topic adjuncts in (5a) can be added to the - list of either the embedded or matrix verb by the topic addition rule. Long-distance scrambling of regular adjuncts is generally impossible, cf. (5b). (5)
a. Asu1-wa/ Ano honya1-wa Taroo -ga kyoo-made tomorrow-/ that bookstore- - today-till [Ziroo -ga hon-o e1 kau] -to omottei-ta. - book- buy - think- “As for tomorrow/that bookstore, Taroo thought till today [that Ziroo would buy a book then/there].” *Ano honya-de1 Taroo -ga kyoo-made b. *Asu1/ tomorrow/ that bookstore-at - today-till [Ziroo -ga hon-o e1 kau] -to omottei-ta. - book- buy - think- “(Int.) Taroo thought till today [that Ziroo would buy a book tomorrow/at that book store].”
Continuing with the second type, we find that a sentence like (1c) above can be embedded under another as in (6a) and the embedded topic can be scrambled to a higher clause as (6b) demonstrates. (6)
a. Uwasa -ga [Hanako -wa Taroo -ga tuini kaetteki -ta] rumor - - - at last return - -to tutaete -iru. - report - “The rumor reports [that, as for Hanako, Taroo returned at last].” b. Hanako1 -wa uwasa -ga [e1 Taroo -ga tuini kaetteki -ta] - rumor - - at last return - -to tutaete -iru. - report - “As for Hanako, the rumor reports [that Taroo returned at last].”
Though the examples above all involve a single level of S embedding, the topic–gap relation is basically unbounded (Gunji 1987) and is not, in general, supposed to be subject to island constraints (Kuno 1973). However, Saito (1986) challenges the latter assumption and divides topicalization into two (totally distinct) subtypes: PP- and nonPP-topicalization. The former is claimed to behave on a par with scrambling (hence is subject to various island constraints) and the latter involves a “base-generated” S-initial topic element (hence is not subject to island constraints). Though it is observationally correct, Saito’s division of the topic constructions into PP- and nonPP-types is arbitrary to the extent that he fails to derive the distinction from the assumptions available in the framework he assumes. In the next section it will be shown that NP- and PP-topicalization belong to a single phenomenon whose differences are a direct consequence of the current lexical assumptions.
2.3
More consequences: relative clauses
2.3.1. Extraction from an NP and aboutness relatives The present proposal has consequences for the analyses of relative clauses of a particular type – those loaded with “aboutness.” The following discussion attempts to demonstrate the fact that the degree of freedom furnished by the topic addition rule is just about flexible enough to capture rather problematic behaviors of aboutness relatives. The - argumenthood of an added topic is not a direct concern here. Relative clauses with a regular flavor are seen in (7), demonstrating relativization on the subject, direct object, and locative adjunct, respectively. (7)
a. [NP [S Taroo -ga e1 kat-ta] hon1] - buy- book] “the book that Taroo bought” kat-ta] otoko1] b. [NP [S e1 hon-o book- buy- man] “the man that bought the book” e1 kat-ta] mise1] c. [NP [S Taroo -ga hon-o - book- buy- store] “the store where Taroo bought a book”
For these NPs, a regular method involving transmission and cancellation seems to work quite well. The ed elements are transmitted to the top of the relative clause, just as in other clauses, where they can be canceled by the head nouns. On the other hand, there are aboutness relatives like the one in (8) for which this regular method is simply ill-suited.9 The problem here is that there is no gap to which we can relate the head noun via . (8)
9
[NP [S Taroo -ga sinde -simat -ta] onna] - die - - woman] “(Lit.) the woman who Taroo died”
See Kuno’s (1973) pioneering work elucidating the unique properties of relative clauses in Japanese. For Kuno, all Ss containing “thematized” NPs can be relativized. According to his transformational account, the derivation of an NP like (ia) is accomplished via “identical NP deletion” and “theme NP deletion” from the deep structure source like (ib). (i)
hon] a. [NP [S Taroo -ga kai-ta] - write- book “the book that Taroo wrote” b. [NP [S Taroo-wa [S Taroo-ga kai-ta]] hon]
Obviously, analyses along these lines involving a powerful and problematic operation like deletion will not find any followers today. Nevertheless, his observations on relativization remain valid. The current account is an attempt to give them an explicit theoretical footing.
For this type, the present proposal offers a simple solution. Using the topic addition rule (3b), we can add a topic NPwa to the - list of sin “die” and that topic NP and bind it by the head noun onna, capturing a strong sense of aboutness for such relative clauses. A different sort of difficulty for the regular method of relative clause formation arises with respect to NPs like (9). (9)
e1 e2 kit-ta] yarikata2 -ga] [NP [S [NP [S Taroo -ga ki-o - tree- cut-] method-] fumei-na] nokogiri1] unclear-] saw] “(Lit.) the saw [which the method [which Taroo employed to cut a tree] is unclear]” 1
1
2
2
Under the regular strategy of relative clause formation, this example is taken to involve two adjunct gaps within S2, making the transmission of information (concerning the first adjunct gap corresponding to the outermost head noun nokogiri1 “saw”) necessary beyond the NP2 boundary. The trouble is that the transmission of beyond the NP2 boundary will be empirically problematic as far as Japanese goes. Unlike in English (10a), Japanese exhibits no instance of extraction from NPs (10b). (Hopelessly bad examples with “case marker stranding” are not given here, i.e. *[S . . . NP1 . . . [NP . . . e1-ga/o/ni/no/kara/etc. . . . ] . . . ].) (10) a. Who1 did you buy [NP a picture of e1]? b. *Taroo -no1 Hanako -ga [NP (sono) e1 syasin-o] kat-ta. - - that picture- see- “*Taroo’s1 Hanako bought [(that) e1 picture].” Faced with the data in (9) and (10b), if the regular -only strategy for relative clause formation is to be retained for relatives like (9), we will be forced to stipulate that extraction from an NP is possible only when the NP appears inside a relative clause – a very unsatisfactory result. It is more plausible to assume that extraction from an NP is uniformly impossible. How does the present proposal help? The NP in (9) can be generated in the following manner utilizing the topic addition lexical rule without supposing extraction from an NP. First, the addition of a topic NPwa to the - list of the transitive verb kir “cut” of S2 (i.e. resulting in -〈NPwa, NPga, NPo〉) takes place. Second, the added NPwa is ed and transmitted to the top node of S2 and gets canceled (ignoring the marking value) by the head noun of NP2 yarikata “method.” Third, another topic NPwa is added to the - list of the intransitive predicate fumei-na “unclear” of S1 (i.e. resulting in -〈NPwa, NPga〉). Fourth, this second topic gets ed, transmitted to the top node of the S1, and canceled (again, ignoring the marking value) by the head noun of NP1 nokogiri “saw.”
A lexical comment on a syntactic topic
207
Actually, for the most deeply embedded relative clause S2 in (10), it does not matter how the gap (e2) for the adjunct yarikata2 “method” gets initiated – either as a straightforward adverb gap or topic gap (as done here). However, to avoid evoking extraction from an NP, it is crucial that there is no gap (e1) for the head noun nokogiri1 “saw” introduced in S2 proper. Rather the adjunct corresponding to this head noun is introduced as the topic for the predicate fumeina “unclear” and ed on the S1 level (not on the S2 level). This means that relationship between the head noun nokogiri1 (the instrument for cutting the tree) and the verb kir “cut” in S2 has to be established indirectly and compositionally with the mediation of the predicate fumeina “unclear.” To the extent that the contextual aboutness relation between the head noun(s) and the modifying relative clause(s) is plausible, such an NP is acceptable. 2.3.2 PP-topicalization Let us examine one more consequence of the current approach regarding relative clauses. I mentioned above that topicalization is generally considered to be immune to island conditions. So, for example, we find the following blatant violation of the complex NP constraint (11a) (cf. the ungrammatical regular scrambling counterpart (11b)). sakka2-o] yoku (11) a. Ano hon-wa1 Taroo -ga [NP [S e2 e1 kai-ta] that book- - write- author- well sittei-ru. know- “As for that book, Taroo knows the author who wrote it well.” sakka2-o] b. *Ano hon-o1 Taroo -ga [NP [S e2 e1 kai-ta] that book- - write- author- yoku sittei-ru. well know- “*That book1, Taroo knows the author who wrote e1 well.” The pattern of acceptability seen here is a straightforward result according to the current proposal given the assumption that extraction from NP is prohibited. (11a) actually involves no extraction from an NP whatsoever. The topic is added to the matrix verb sir “know” by the topic addition lexical rule. In contrast, there is no way to derive (11b) without evoking extraction from NP. What is interesting is the fact that suffixing a given item with the topic marker is not a sufficient (though necessary) condition for escaping the effect of the island constraint. Thus Saito (1986) reports that long-distance topicalization involving PPs is more restricted. The only difference between (12a, b) below is that, in the unacceptable (b)-example, the topic is not simply an NP but a PP combined with the topic marker.
208
(12) a. Sendai -wa1 Taroo -ga [NP [S e2 e1 dekake-ta] hito2-o] yoku - - went- person- well sittei-ru. know- “As for Sendai, Taroo knows the person who went there well.” b. *Sendai -ni-wa1 Taroo -ga [NP [S e2 e1 dekake-ta] hito2-o] -to- - went- person- yoku sittei-ru. well know- “(Lit.) As for to Sendai, Taroo knows the person who went there well.” Though he indeed makes valid observations, Saito finds no way to assimilate the contrast under the GB assumptions and ends up dividing topicalization arbitrarily into two types that are either the result of base-generation (with no subjacency violation) or scrambling (with a subjacency violation). The division is not independently motivated at all and contrast remains mysterious to him. Tateishi (1994) suggests that there are two distinct -was. One is a D head of a DP and the other is a P head of a PP. The former can be employed for both “topic” and “contrastive” interpretations while the latter is used only for the “contrastive” usage. It is claimed that only “topic” -wa is exempt from the subjacency effect. According to Tateishi, -wa in a PPwa is a P and of the “contrastive” type, scrambling of which results in a subjacency violation. Tateishi’s distinction between “topic” vs. “contrastive” -was strikes me as a rephrasing of Saito’s base-generation vs. scrambling opposition in such a way that it merely describes the contrast. In addition, claiming that a PPwa is used only for contrast is simply incorrect empirically. The discourse in (13) is just fine with the PPwa in speaker B’s utterance receiving a topic reading. (13) A: Taroo -ga Sendai -ni dekake-ta. - -to went- “Taroo went to Sendai.’’ B: Soko-ni-wa Hanako -mo itta-koto-ga-aru. that.place-to- -also go.experience.exist- “As for to that place, Hanako has also been (there).” The contrast between NP and PP topicalization follows from the present account automatically. (12a) requires no further explanation – it is generated in the same way as (11a). The only way to obtain (12b) is the following. First, we apply the topic substitution rule to the verb dekaker “go” in the relative clause, changing the locative complement PPni to PPni-wa. We cannot use the topic addition rule here since all it can do is to add a new NPwa (not PPni-wa) to an - list. Then this PPni-wa will have to be ed and extracted out of the object NP. But, as we know, this is impossible, hence the
A lexical comment on a syntactic topic
209
ungrammaticality of (12b). It is noted that the extraction of such a PP from a sentential complement per se is possible as the example below indicates. it-ta. (14) Sendai -ni-wa1 Taroo -ga [S Hanako -ga e1 dekake-ta] -to -to- - - went- - say- “As for to Sendai, Taroo said [that Hanako went (there)].”
2.4
Topic as an ARG-S element
Some remarks are due concerning the status of a topic as a lexically substituted/ added - element. Topicalization has been divided into two subtypes: the topic substitution type and the topic addition type. The above lexical approach to the former seems to be unproblematic in that a “substituted topic” not only becomes a topic of a sentence but also inherits the semantic role assumed by one of the original - elements before substitution. In this way, a substituted topic is both a topic and a semantic argument simultaneously, accommodating a native-speaker intuition naturally. The story is more subtle for the topic addition type. Lexically adding an - element which does not correspond to a semantic argument of a predicate (verbs among others) may strike some as counterintuitive, but this, as I argue in the discussion below, is both conceptually and empirically justified. Also related to this aspect of the current proposal is clause-internal scrambling which is a potential trouble spot. Since an added topic is part of an - list, if nothing more is said, it is expected to behave on a par with other “true” arguments with respect to scrambling. In this subsection, we examine these matters. 2.4.1 Conceptual justification Theoretically speaking, a lexical operation similar to the present topic addition rule is by no means an exception within the context of HPSG. For example, Manning et al. (this volume) suppose that exactly the same sort of lexical operation is called for in providing an account for adverbial scope ambiguity within a single sentence in Japanese. They assume that there is a lexical process that adds an adverbial element to an - list. It is noted that an adverb thus added to an - list also lacks a semantic role. They further note that this adverbial addition is in the spirit of a type-raising rule of . In light of such considerations, the topic addition lexical rule proposed above begins to look less unintuitive. Incidentally, again in (e.g. Steedman 1988), a topic (in English) is treated as a type-raised functor category that is to combine with a sentence as an argument. 2.4.2 Empirical justification Past research (e.g. Wasow 1977, 1980, Roeper and Seigel 1978, etc.) offer criteria to determine if a given word/construction is lexically derived or
not. For example, Wasow (1977) proposes that lexical rules (a) can change the categories of words (syntactic rules cannot); (b) feed syntax, i.e. do not change structures; (c) perform "local" grammatical relation-changing operations; (d) can be quite idiosyncratic. Unfortunately, we cannot apply some of these criteria in a straightforward way (some are downright irrelevant – items (a) and (c)) to the cases of topicalization, since the effects of the topicalization lexical rules are observable only postlexically and, even then, other nonlexical factors (like pragmatic considerations for item (d)) can be employed to explain the data. In this way many of the topicalization facts noted above and below can arguably be made consistent with a nonlexical account of the phenomenon in question. Though empirical data supporting the lexical status of topicalization is indeed hard to come by, there is nevertheless some syntactic evidence for it in the spirit of item (b) above. Let us note the following facts about coordination. In (15a, b), we see the topic substitution verb sinpai-da "be.concerned" (derived from a two-place predicate) and the topic addition verb byooki-da "be.sick" (derived from a one-place predicate), respectively. What is interesting is that these two verbs can be coordinated, as seen in (15c). (For (15c), we have to be careful not to sneak in a "reason" reading for the first conjunct verb. The two readings – reason vs. conjunction – can be distinguished by considering their distinct truth conditions.)
(15) a. Hanako-wa Taroo-ga sinpai-da.
        Hanako-TOP Taroo-NOM concern-COP-PRES
        "As for Hanako, she worries about Taroo."
        ARG-S〈NPga, NPga〉 ⇒ ARG-S〈NPwa, NPga〉   (topic substitution)
     b. Hanako-wa Taroo-ga byooki-da.
        Hanako-TOP Taroo-NOM sick-COP-PRES
        "As for Hanako, Taroo is sick."
        ARG-S〈NPga〉 ⇒ ARG-S〈NPwa, NPga〉   (topic addition)
     c. Hanako-wa Taroo-ga [TVP byooki-de (sosite) totemo sinpai-da].
        Hanako-TOP Taroo-NOM sick-COP (and) very.much worry-COP-PRES
        "As for Hanako, Taroo is sick and (she) worries about (him) very much."
What is significant about (15c) is that this is an instance of TVP coordination (a null hypothesis). Coordination of TVPs would have been impossible had the topic addition lexical rule not already applied to the first conjunct predicate byooki-da (as in (15b)), making it a two-place verb prior to syntax. Thanks to the topic addition rule, then, both TVP conjuncts in (15c) have ARG-S〈NPwa, NPga〉, which are conjoinable using the coordination schema of P&S (1994).
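To make the two lexical operations behind (15) concrete, the following is a minimal sketch, my own illustration in Python rather than part of the author's HPSG formalism, of the topic substitution and topic addition rules as manipulations of ARG-S lists; the list representation and case labels are simplified stand-ins.

# A minimal sketch (not the author's formalism): ARG-S lists modeled as Python
# lists of case-marked NP slots, with the two topicalization rules as functions.

def topic_substitution(arg_s, target="NPga"):
    """Replace one existing ARG-S element with a topic-marked NPwa."""
    out = list(arg_s)
    out[out.index(target)] = "NPwa"
    return out

def topic_addition(arg_s):
    """Add a new NPwa to the front of the ARG-S list (it bears no semantic role)."""
    return ["NPwa"] + list(arg_s)

# sinpai-da 'be concerned' (two-place) undergoes topic substitution,
# byooki-da 'be sick' (one-place) undergoes topic addition:
assert topic_substitution(["NPga", "NPga"]) == ["NPwa", "NPga"]
assert topic_addition(["NPga"]) == ["NPwa", "NPga"]
# Both outputs match, which on these simplified terms is what licenses the
# TVP coordination in (15c).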
It may be argued (albeit against the null hypothesis) that an alternative S-coordination analysis like (16) is possible here, involving some sort of empty pronominals (like pro).
(16) [S Hanako1-wa Taroo2-ga byooki-de] (sosite) [S pro1 pro2 sinpai-da]
This type of analysis is problematic in the following ways: the pros in these examples are expected to behave as dictated by Principle B of the Binding Theory of HPSG (or of GB, for that matter) – they must be locally o-free (or can be coreferential with any NP outside of their governing category). But the pros here in fact act as anaphors: in (16), pro1 and pro2 cannot refer to anybody but Hanako and Taroo, respectively. This means that there has to be a construction-specific stipulation that the binding property of a pro in a coordinate structure is different from that of a "regular" pro. Replacing pro with PRO – a possibility in GB – will not only betray the original motivation for using pro, which appears in governed positions, but also force us to formulate a special condition of the Control Theory dictating how a PRO is obligatorily controlled in a coordinated structure.10 It is emphasized that, given the present assumptions, (15c) is treated as a simple TVP coordination without appealing to any additional assumptions, including problematic empty NPs of any sort.
2.4.3 Clause-internal scrambling
As noted above, with respect to scrambling, both topic substitution and topic addition topicalization are expected to exhibit the same range of word order possibilities observed in regular nontopicalization sentences. (The range of scrambling considered here is clause internal.) This is due to the fact that both topic elements are treated equivalently to other ARG-S elements. However, topicalization diverges from other "regular" constructions and allows a more restricted range of ordering possibilities. Observe the contrast in (17) between a regular transitive sentence (17a, b) and a topic addition sentence (17c, d).11
(17) a. Hanako-ga Taroo-o kusugut-ta.
        Hanako-NOM Taroo-ACC tickle-PAST
        "Hanako tickled Taroo."
10 See Iida (1996) and Fukushima (to appear) for more detailed criticisms of employing empty NPs in coordination cases similar to the present one. In particular, the latter demonstrates that the classification of binding domains based on [± anaphoric, ± pronominal] features is incapable of capturing the behavior of empty NPs in Japanese even for a basic set of data.
11 Something like (17d) is possible when -wa is used contrastively, as in (i).
   (i) Taroo-ga nattoo-wa kirai-da.
       Taroo-NOM fermented.soy.beans-TOP not.fond-COP
       "In contrast to other food, Taroo hates fermented soy beans."
   As stated above, these data are not covered by the current proposal.
b. Taroo -o Hanako -ga kusugut-ta. - - tickle- “=(a)” c. Hanako -wa Taroo -ga kaetteki-ta. (=(1c)) - - come.back- “As for Hanako, Taroo returned.” d. *Taroo -ga Hanako -wa kaetteki-ta. - - come.back- “(Int.)=(c)” Would this fact from scrambling undermine the current supposition that a topic is indeed an - element? If we fail to find independent factors excluding examples like (17d), the answer is yes. However, I argue that there are indeed such independent reasons.12 First of all, there is a general (maybe universal) tendency for a topic to be placed at the beginning of a sentence, e.g. English topicalization examples in (18). (18) a. Mary gave the book to John. b. The book, Mary gave to John. c. *Mary, the book, gave to John. This word order restriction is not surprising from a processing point of view. A topic-comment relation is constructed in a way that a topic is presented (temporarily) prior to the presentation of a comment on that topic. The Japanese example (17d) violates this as does the English sentence in (18c). Second, it is instructive to consider the following contrast between nonscrambled and scrambled topic substitution sentences (19b) and (19c) both of which are possible continuations of the sentence in (19a). ((19c) should be read with no special suprasegmental effects, such as an emphasis on Taroo-o and a pause after it, since such an emphasis is not at all necessary for regular scrambling like (17b).) Compare this to the contrast introduced in (17) above. (19) a. Hanako -to Masako -to Taroo -ga karakai-atte-i-ta. -and -and - tease--- “Hanako, Masako, and Taroo were teasing each other.” b. Hanako -wa Taroo -o kusugut-ta. - - tickle- “As for Hanako, she tickled Taroo.” (topic reading) c. Taroo -o Hanako -wa kusugut-ta. - - tickle- “Hanako, as opposed to Masako, tickled Taroo.” (contrastive reading) ≠ “As for Hanako, she tickled Taroo.” (topic reading) 12
Simply stipulating a linear precedence statement like NPwa < NP would be a way out of this, but a solution with independent motivation seems preferable.
In contrast to (17d) above, (19c) is certainly a good sentence. However, this example is not readily interpretable as topicalization; rather, an overwhelming preference is given to a "contrastive" reading, which is another effect of the marker -wa. We immediately notice that, with respect to topicality and semantic role assignment, the sentences in (19b, c) fall in between (17a) and (17c) – Hanako-wa in (19b) is both a topic and a semantic argument of the verb kusugut-ta. With this observation, a possible (functional) explanation for the contrast between the three groups above emerges: (17d) is bad since it violates the topic-first condition. In addition, due to its add-on status, with no semantic role to play in the sentence, the added topic cannot be salvaged as an object of contrast by the contrastive strategy, which presumably utilizes coherent semantic information (particularly semantic role assignment) available in the sentence. The NPwa in (19c) violates the topic-first condition as well, but, due to its double role, it is still the semantic subject of the sentence, for which a contrastive reading can be obtained. The topic-first constraint and the contrastive strategy are irrelevant for (17a, b).
2.5
Summary
In this section, I have shown that the proposed lexical account of topicalization is not only possible but also well motivated: under a common set of assumptions it offers explanations for, and unites, otherwise unrelated facts, some of which are problematic for other accounts or have hitherto gone unexplained. Additionally, it has been demonstrated that some potential trouble spots – both conceptual and empirical – are only apparent, and that independent motivation and factors can be found to neutralize them.
3
Theoretical implications
Having established a lexical approach to topicalization and reviewed the empirical consequences of the proposal, let us now consider its theoretical implications. The relevant implications concern the nature of lexical rules and the ways they are constrained. A lexically oriented constraint on rules (e.g. metarules) relating one lexical item to another (albeit indirectly via ID rules) can be found in GPSG (Gazdar, Klein, Pullum, and Sag 1985; see also Flickinger 1983). It is supposed that such rules can apply only to ID rules mentioning a lexical head with a SUBCAT feature specification. With the advent of HPSG, the range of linguistic information that a given lexical item carries has expanded dramatically to include not just syntactic information but also information belonging to the semantic and pragmatic domains. Due to the theory's emphasis on the declarativeness and monotonicity of linguistic information, lexical rules can get their hands on virtually any aspect of lexical information. Given this, I believe that carefully considering
the properties of lexical rules is not only meaningful but also necessary within the context of HPSG. For one thing, lexical rules are invoked by HPSG researchers very frequently to capture significant linguistic generalizations. On the other hand, the HPSG literature seems to have devoted little attention to the theoretical and empirical delimitation of such powerful devices.13 It is hoped that the discussion given here, though preliminary, serves as a point of departure for more detailed future research on the subject.
3.1
Lexical rule interaction
It is recalled that the two topicalization rules introduced above involve a substitution or addition of a topic in the - list. Since the output of the topicalization rules are still lexical items, it is reasonable to expect that other lexical rules are able to apply to the output. However, it turns out that, for example, the causative lexical rule (Fukushima 1992, Uda 1993, Manning et al. this volume; see also Gunji this volume) and the passive lexical rule (Pollard and Sag 1987, 1994, Fukushima 1990, Uda 1993) are able to apply only to the output of the substitution topicalization rule. The contrast between (20 –21) on the one hand and (22–23) on the other is a witness to this discrepancy. ((c)-examples indicate manipulations of the relevant - lists by the lexical rules.) For example, in (20a), Taroo-wa is a substituted topic corresponding to one of the original - elements of the input verb kaetteku “come back”. (20b) shows that the causative rule can apply to the verb of (20a) and add the causer Hanako-ga to its - list (see (20c)).14 Similar remarks apply to the passive example (21) (assuming permutation of - elements). (All topic NPs are placed at S-initial positions to avoid the topic-first constraint mentioned above.) (20) a. Taroo -wa kaetteki-ta. (=(1b)) - come.back- “As for Taroo, he returned.” b. Taroo -wa Hanako -ga kaetteko -sase-ta. - - come.back -- “As for Taroo, Hanako made him return.” causative c. -〈 NPwa〉 ⇒ -〈NPga, NPwa〉 [NPga = causer] (21) a. Taroo -wa Hanako -o but-ta. - - hit- “As for Taroo, he hit Hanako.”
13 An alternative to this sort of effort is to abandon lexical rules altogether in favor of other apparatus, such as the . Though it may be relevant, such an alternative is not discussed in this paper.
14 The causee Taroo can be marked by -wa or -ni-wa.
b. Taroo -ni-wa Hanako -ga but -are-ta. -- - hit -- “As for Taroo, Hanako was hit by him.” passive c. -〈 NPwa 1, NPo 2 〉 ⇒ -〈NPga 2, NPni-wa 1〉 [NPga 2 = original object] In contrast, (22a) involves a verb which is the output of the topic addition rule which added Hanako-wa to its - list as an extra element. As (22b) demonstrates, the causative rule cannot be applied to the verb in (22a) to add the causer Ziroo-ga (see (22c)). Likewise for (23). ((23b) is fine as an “indirect” passive sentence with the reading that Taroo is adversely affected by that action of Hanako, but this is not what is intended.) (22) a. Hanako -wa Taroo -ga kaetteki-ta. (=(1c)) - - come.back- “As for Hanako, Taroo returned.” b. *Hanako -wa Ziroo -ga Taroo -ga kaetteko -sase-ta. - - - come.back -- “(Int.) causative of (a)” causative
c. *-〈NPwa, NPga〉 ⇒ -〈 NPga1, NPwa, NPga〉 [NPga1 = causer] (23) a. Hanako -wa Taroo -ga kodomo-o but-ta. - - child- hit- “As for Hanako, Taroo hit (her) child.” b. *Hanako -ni-wa Taroo -ga kodomo-o but -are-ta -- - child- hit -- “(Int.) passive of (a)” c. *-〈NPwa1, NPga2 , NPo3〉 ⇒ -〈NPga2, NPo 3, NPni-wa1〉 passive
3.2
Constraints on lexical rules
Would the results above undermine the current lexical approach? The answer depends on what counts as a “proper” application of a lexical rule. If it is the case that there are no constraints on the applications of lexical rules, all the forms above are expected to be well-formed. However, such a situation is hard to imagine. Take, as an example, the passive lexical rule. The rule should not apply, for example, to English intransitive verbs like sleep: *It was slept by John, meaning John slept. Informally put, a constraint on the passive rule will be that there have to be at least two - elements or two semantic arguments for the input verb. Similarly, there ought to be some constraint discriminating (20 –21) from (22–23). Unlike the passive rule, however, such a constraint on the topicalization rules will have to refer to nonsyntactic/ semantic parts of the lexical entry of the input verb due to the fact that topicalization calls for crucial references to pragmatic factors.
I propose the following as a general constraint on rule applications, not just for topicalization in Japanese.15
(24) ARG-S NONCONTAMINATION: An application of a lexical rule altering the order of the ARG-S list of the input predicate is not allowed if the input's ARG-S list contains an element (a "contaminant") not corresponding to an argument of a semantic predicate (in CONTENT).
Intuitively, (24) captures, among other things, the fact that grammatical relation-changing rules (e.g. causative, passive, etc.) – which change the order of an ARG-S list by either adding a new element or permuting the list – can only be applied to verbs retaining a "pure" ARG-S list that has not been contaminated with some element not corresponding to any semantic argument of the verb. Given this, it follows that, for example, the application of the causative rule to the verb of (20a) is allowed (licensing (20b)), since the NPwa is an ARG-S element of the topic substitution verb corresponding to both a semantic and a pragmatic argument. But the application of the same rule to the topic addition verb of (22a) is impossible (hence (22b) is not licensed). The latter case involves an NPwa (a contaminant) corresponding solely to the argument of the pragmatic predicate about in CONTEXT|BACKGROUND. Restricting our attention to topicalization still, there are other predictions of (24). First, it should be noted that, as (25) indicates, the application of the topic addition rule (let alone the topic substitution rule) to the output of the causative rule is per se possible, since it does not violate the condition in question.
(25) a. Ziroo-ga Taroo-ni kaetteko-sase-ta.
        Ziroo-NOM Taroo-DAT come.back-CAUS-PAST
        "Ziroo made Taroo return."
     b. Hanako-wa Ziroo-ga Taroo-ni kaetteko-sase-ta.
        Hanako-TOP Ziroo-NOM Taroo-DAT come.back-CAUS-PAST
        "As for Hanako, Ziroo made Taroo return."
     c. ARG-S〈NPga, NPni〉 ⇒ ARG-S〈NPwa, NPga, NPni〉   (topic addition)   [NPga = causer]
Second, the topic addition (but not topic substitution) rule will have to apply after all the grammatical relation-changing rules have applied. Third, it also follows that there will be no "piling up" of added (but not substituted) topic NPs. These predictions turn out to be true. The second point has already been demonstrated,
Whether or not such a constraint is derivable from other aspects of HPSG is a point left open in this paper. The constraint, then, is nothing more than a descriptive generalization at the moment. Also an English sentence like John is believed to be smart is not problematic. John is a semantic argument of the predicate smart (indirectly through structure sharing), though it is not a semantic argument of believe to which the passive lexical rule applies. True exceptions to the generalization are cases involving expletive arguments like there and it, for example, in there is believed to be a problem. However, one crucial difference between these and non-expletive instances is that the former is truly “meaningless” in that they are not arguments of any predicate, either semantic or pragmatic.
for example, with the contrast between (20b) and (22b). The third point is shown to be the case by the data in (26). (26) a. Hanako -wa Taroo -ga kaetteki-ta. (=(20b)) - - come.back- “As for Hanako, Taroo returned.” b. *Ziroo -wa Hanako -wa Taroo -ga kaetteki-ta. - - - come.back- “*As for Ziroo, as for Hanako, Taroo returned.” c. Hanako -wa Taroo -wa kaetteki-ta. - - come.back- “As for Hanako, as for Taroo, he returned.” Ziroo-wa in (26b) is the second added topic whose addition, according to (24), is not possible, while Hanako-wa and Taroo-wa in (26c) are added and substituted topic NPs, respectively. In fact all the original - elements of an input verb can be changed into topics by the topic substitution rule as long as there is only one added topic (if any) – a possibility allowed by the present system. It should be noted that these results obtain not by ordering the lexical rules in question arbitrarily and extrinsically, but rather as a consequence of the general constraint (24). Contradicting the third prediction, however, one of the referees points out that piled up topics appear to be possible in some instances, e.g. (27). (27) a. Ano meekaa-wa seihin-wa nedan-ga taka-i. that manufacturer- product- price- high- “As for that manufacturer, as for its products, their prices are high.” b. Sono daigaku-wa gakutyoo-wa musuko-ga yuumeina that university- president- son- famous sakka-da. novelist-. “As for that university, as for its president, his son is a famous novelist.” These examples are indeed fine and are counterexamples to the account given above, if nothing more is said. A clue for reconciling (27) with the current proposal is found by a careful examination of the properties of the multiple-ga construction (see Kuno 1973 and Kuroda 1992 among others). We see that, for example, (27a) can have a multiple-ga counterpart seen in (28a), while (26b) can not – (28b) is hopeless. This is surprising since the two sentences are superficially identical. (28) a. Ano meekaa-ga seihin-ga nedan-ga taka-i. that manufacture- product- price- high- “Of that manufacture, of its products, their prices are high.” b. *Ziroo -ga Hanako -ga Taroo -ga kaetteki-ta. - - - come.back-
It is noted that the first two NPgas in (28a) are in a “modificational” relation16 to the Ns that immediately follow them. So the NPga-sequence in question can be understood as: [NP ano meekaa-no [N′ seihin-no [N′ nedan]]] “that manufacture’s product’s cost” (likewise for (27b)). But such a con-strual is impossible for (28b). Now, how would (28a) get generated? A possibility that is consistent with the spirit of the current approach is that there is a lexical rule, call it “subjectivization” (similar to the topic addition rule), that adds an NPga to the - list of the predicate takai “high,” observing the condition that the added NPga have to qualify as a modifier for the original - element, nedan “price,” of the predicate.17 Crucially, this rule is different from the (nonrepeatable) topic addition rule in that its semantic effect is not null, making it repeatable. So the original - list of takai -〈NPga 1〉 is altered to -〈NPga3, NPga 2, NPga 1〉 [NPga 1 = original subject]. If this much is feasible, getting (27a) is straightforward. We can apply the topic substitution rule and change -gas to -was for the first two NPs of (28a), i.e. . sub 〈NPga 3, NPga 2, NPga 1〉 topic ⇒ -〈NPwa3 , NPwa2, NPga 1〉. In contrast, the first two NPgas in (28b) do not qualify as modifiers in the required sense. Thus the contrast between (27a) and (26b) is attributed to (hence predictable from) the availability of corresponding multiple ga structures which in turn correlates with the existence/absence of possible modificational relations. Finally, let us examine another set of facts vindicating the need for the - restriction quite independently of topicalization. As is well known, there are two passives in Japanese: “direct” and “indirect” (or “adversity”) passives. The former is quite equivalent to the passives in English involving permutation of an - list and has already been introduced above. The latter, in the spirit of lexicalism, is a construction generated by a lexical rule that is in some sense similar to the topic addition rule in such a way that the added “adversity subject” is not assigned any semantic role by the verb to which the passive morpheme is added. We assume that the adversity subject stands in a pragmatic relation (expressed in | ) of being adversely-affected by some event expressed by the rest of the sentence (cf. Kuno 1973, Kuroda 1992, Oehrle and Nishio 1981). In 16
C.f., a celebrated example due to Kuno (1973) in (ia) which has a genitive counterpart (ib). (i) a. Bunmeikoku-ga dansei-ga heikinzyumyoo-ga mizika-i. civilized country- male- average life span- short- “It is in civilized countries where the average life span of males is short.” b. Bunmeikoku-no dansei-no heikinzyumyoo-ga mizika-i. civilized country- male- average life span- short- “The average life span of males in civilized countries is short.”
17
Unfortunately, the exact nature of this modificational relation is not fully understood but at least includes a possessive relation in addition to more traditional modifier–modifiee relations. According to Kuno (1973) these NPgas are generated by the rule of “subjectivization” that converts (iteratively) the left-most genitive-marked NP into a nominative marked one that is to become the new subject of the sentence.
fact, unlike the cases involving topic addition where no morphological change takes place, this is a tangible instance of lexically adding an element to an - list accompanied by an explicit morphological change (see also the benefactive lexical rule seen below). We find examples of each in (29). (29) a. Hanako -ga Taroo -ni but -are-ta. - - hit -- “Hanako was hit by Taroo.” dir . passive -〈NPga 1, NPo2〉 ⇒ -〈NPga 2, NPni1〉 [NPga 2 = original object] b. Hanako -ga Taroo -ni iedes -are-ta. - - walk.out.on -- “(Lit.) Hanako was walked out on by Taroo (i.e. adversely affected by Taroo’s walking out on her).” ind . passive -〈NPga 1〉 ⇒ -〈NPga 2, NPni 1〉 [NPga 2 = added adversity subject] What is interesting is that the benefactive lexical rule suffixing -morau to a verb can be applied to the output of the direct passive rule in (29a), deriving (30a) in which we find an extra - element corresponding to a beneficiary.18 (This situation is similar to the indirect passive rule above.) In contrast, the same rule cannot be used to “benefactivize” the output of the indirect passive rule – (30b) is impossible. (30) a. Ziroo -ga Hanako -ni Taroo -ni but -arete-morat-ta. - - - hit -- “Ziroo received a favor from Hanako in which she was hit by Taroo.” benefactive -〈NPga 2, NPni1〉 ⇒ -〈NPga 3, NPni2, NPni1〉 [NPga 3 = beneficiary] b. *Ziroo -ga Hanako -ni Taroo -ni iedes -arete-morat-ta. - - - walk.out.on -- “(Lit.) Ziroo received a favor from Hanako in which she was walked out on by Taroo.” benefactive *-〈NPga 2, NPni2〉 ⇒ -〈NPga 1, NPni 2, NPni 1〉 Obviously, the - (24) allows (30a) but excludes (30b), due to the fact that, in the latter as assumed above, the adversity subject is solely an argument of a pragmatic predicate, hence a contaminant.
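As a purely illustrative summary of how the constraint in (24) regulates these interactions, here is a minimal sketch in Python (my own, not the author's formalism) in which each ARG-S element carries a flag recording whether it corresponds to a semantic argument; relation-changing rules are blocked whenever a contaminant is present. The class and function names are invented for the illustration.

# A minimal sketch (my own illustration, not the author's HPSG formalism) of the
# constraint in (24): a relation-changing lexical rule is blocked whenever the
# input ARG-S list contains a "contaminant", i.e. an element corresponding to
# no semantic argument (only to a pragmatic one, such as an added topic or an
# adversity subject).

from dataclasses import dataclass

@dataclass
class ArgSElement:
    marking: str        # e.g. "NPwa", "NPga", "NPni", "NPo"
    semantic_arg: bool  # does it correspond to an argument of a semantic predicate?

def contaminated(arg_s):
    return any(not e.semantic_arg for e in arg_s)

def causativize(arg_s):
    """Causative rule: add an NPga causer; subject to the condition in (24)."""
    if contaminated(arg_s):
        raise ValueError("blocked by (24): ARG-S contains a contaminant")
    return [ArgSElement("NPga", True)] + list(arg_s)

# Topic substitution leaves the topic a semantic argument, so causativization of
# (20a) goes through, as in (20b):
causativize([ArgSElement("NPwa", True)])

# Topic addition introduces a purely pragmatic NPwa, so (22b) is excluded:
added_topic = [ArgSElement("NPwa", False), ArgSElement("NPga", True)]
# causativize(added_topic)  # would raise: blocked by (24)

# The same check covers (30): the adversity subject added by the indirect
# passive is not a semantic argument, so subsequent benefactivization is out.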
4
Concluding remarks
This paper started with the idea that topicalization in Japanese is to be accomplished by lexical rules and has examined both empirical and 18
There are different benefactive morphemes. For example, -moraw indicates the subject is a beneficiary while -age makes the subject a benefactor. We use only the former in the text. See Gunji (1987), Fukushima (1990), and Uda (1993) for more detailed discussion on the subject.
theoretical consequences of a lexical account of the phenomenon formulated in the framework of HPSG. As demonstrated above, the proposed account is not only motivated but also theoretically interesting. However, it is noted that the account presented here is not the only possibility found in the literature dealing with the phenomenon. Among other well-articulated treatments worthy of close scrutiny is Gunji’s (1987) syntactic account. According to Gunji, topic elements are taken to be (adverbial) adjunct PPs for some V-projections (inclusive of S) and constructed by combining an NP or PP with one of the several topic markers -wa with distinct lexically specified properties. (There is of course only one topic marker according to the current approach.) These lexical specifications determine whether a given topic PP adjoins to (a) a sentence with a gap (i.e. the topic substitution type), (b) a sentence without a gap (i.e. the topic addition type), or (c) a sentence with a reflexive element zibun. Though the current system and that of Gunji’s seem to offer an equivalent coverage of the facts for the most part, the two accounts indeed bring about distinct empirical consequences. For example, I do not immediately see how the differences between PP- and NP-topicalization with respect to relative clauses seen above can be captured in his system. However, instead of doing an extensive comparison between the two, I mention a few pieces of basic distributional data that suggest that the current account is more natural. We focus on the topic addition type. Since Gunji treats a topic as a V-projection modifier, it is supposed to exhibit adverbial behaviors. This expectation is not borne out. First, a topic cannot be coordinated with other adverbs. (31) a. [ADVP Kyoo sosite tasikani] Taroo -ga kaetteki-ta today and certainly - come.back- “Today and certainly Taroo came back.” b. *[ADVP Hanako -wa sosite tasikani] Taroo -ga kaetteki-ta - and certainly - come.back- “*As for Hanako and certainly Taroo came back.” If Hanako-wa in (31b) were a sentential modifier, it ought to be able to be conjoined with another such modifier like tasikani “certainly.” Second, modifiers for adverbs cannot be used to modify a topic. When the adverb kakuzituni “definitely” precedes another adverb like kyoo “today” as in (32a) (given that they are compatible) the first can be taken to modify either the second adverb alone or the entire sentence including the second adverb. Matters are different when an adverb precedes a topic as in (32b). The only possible construal of the adverb kakuzituni is for the entire sentence. (32) a. Kakuzituni kyoo Taroo -ga kaetteku-ru. definitely today - come.back- “[Definitely today, not tomorrow] Taroo comes back.” “Definitely [Taroo comes back today].”
b. Kakuzituni Hanako -wa Taroo -ga kaetteku-ru. definitely Hanako - - come.back- “Definitely [as for Hanako, Taroo comes back today].” ≠ “[Definitely as for Hanako, not someone else] Taroo comes back.” The modifier analysis owes us some explanations as to why expected adverbial behaviors do not obtain. In contrast, all these indications of the nonadverbial nature of an added topic are automatic consequences of the present proposal according to which added topics are not sentential modifiers.
References Flickinger, D. 1983. Lexical heads and phrasal gaps. In Proceedings of the Second West Coast Conference on Formal Linguistics, ed. M. Barlow, D. Flickinger, and M. Wescote. 89–101. Fukushima, K. 1990. VP-embedding control structures in Japanese. In Grammatical Relations: A Cross Theoretical Perspective, ed. K. Dziwirek, P. Farrell, and E. MejíasBikandi. Stanford: Center for the Study of Language and Information. 163–182. Fukushima, K. 1991. Phrase structure grammar, montague semantics, and floating quantifiers in Japanese, Linguistics and Philosophy 14: 581– 628. Fukushima, K. 1992. The implications of Japanese honorification for the proper treatment of complex predicates. Presentation at 1992 Head-driven Phrase Structure Grammar Workshop, CSLI, Stanford University. Fukushima, K. to appear. Bound morphemes, coordination, and bracketing paradox. Journal of Linguistics. Gazdar, G., E. Klein, J. Pullum, and I. Sag. 1985. Generalized Phrase Structure Grammar. Cambridge, MA: Harvard University Press. Gunji, T. This volume. On lexicalist treatments of Japanese causatives. Gunji, T. 1987. Japanese phrase structure grammar. Dordrecht: Reidel. Hinds, J., S. Maynard, and S. Iwasaki. 1987. Perspectives on Topicalization: The Case of Japanese wa, Philadelphia: John Benjamins. Iida, M. 1996. Context and Binding in Japanese. Stanford: Center for the Study of Language and Information. Kuno, S. 1973. The Structure of the Japanese Language. Cambridge, MA: MIT Press. Kuroda, S.-Y. 1965. Generative Grammatical Studies in the Japanese Language. Ph.D. dissertation, MIT. Kuroda, S.-Y. 1992. Japanese Syntax and Semantics. Dordrecht: Kluwer. Manning, C., I. Sag, and M. Iida. This volume. The lexical integrity of Japanese causatives. Mikami, A. 1960. Zoo-wa Hana-ga Nagai (“As for an elephant, its trunk is long”). Tokyo: Kuroshio. Oehrle, R. and H. Nishio. 1981. Adversity. In Proceedings of the Arizona Conference on Japanese Linguistics, Coyote Papers: Working Papers in Linguistics from A → Z, vol. 2, ed. A. Farmer and C. Kitagawa. Department of Linguistics, University of Arizona, Tucson. 163–185. Pollard, C. and I. Sag. 1987. Information-based Syntax and Semantics I. Stanford: Center for the Study of Language and Information.
Pollard, C. and I. Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information. Roeper, T. and M. Seigel. 1978. A lexical transformation for verbal compounds. Linguistic Inquiry 9, 199–260. Saito, M. 1985. Some Asymmetries in Japanese and their Theoretical Implications. Ph.D. dissertation, MIT. Saito, M. 1986. Three notes on syntactic movement in Japanese. In Issues in Japanese Linguistics, ed. T. Imai and M. Saito. Dordrecht: Foris. 301–350. Shirai, K. 1986. Japanese noun-phrases and particles wa and ga. In Foundations of Pragmatics and Lexical Semantics, ed. J. Groenendijk, D. de Jongh, and M. Stokhof. Dordrecht: Foris. 63– 80. Steedman, M. 1988. Combinators and grammars. In Categorial Grammars and Natural Language Structures, ed. R. Oehrle, E. Bach, and D. Wheeler. Dordrecht: Reidel. 417–442. Tateishi, K. 1994. The Syntax of “Subjects”. Stanford: Center for the Study of Language and Information. Uda, C. 1993. Complex Predicates in Japanese: An Approach in Head-Driven Phrase Structure Grammar. Ph.D. dissertation, University of Victoria. Wasow, T. 1977. Transformations and the lexicon. In Formal Syntax, ed. P. Culicover, T. Wasow, and A. Akmajian. New York: Academic Press. 327–360. Wasow, T. 1980. Major and minor rules in lexical grammar. In Lexical Grammar, ed. T. Hoekstra, H. van der Hulst, and M. Moortgat. Dordrecht: Foris. 285–312. Yoon, J.-H. 1994. The semantics of relative clauses in Korean. In Japanese/Korean Linguistics, vol. 4, ed. N. Akatsuka. Stanford: Center for the Study of Language and Information. 413– 428.
6. Agreement and the syntax–morphology interface in HPSG
Andreas Kathol, UC Berkeley
1
Introduction
The question of agreement in natural language has received much attention in recent years (cf. for instance the introduction of heads and accompanying projections for agreement with subject (“AgrS”) and object (“AgrO”) in recent transformational work).1 Its importance in phrase-structure-based approaches to grammar is evidenced by the fact that a whole chapter is devoted to this topic in the latest monograph on HPSG (Pollard and Sag 1994). While there are probably few other places in the literature covering a similarly wide array of facts in such detail, questions remain about both the general approach and the specifics of some of Pollard and Sag’s (hereafter P&S) analyses. In particular, it will be shown below how their treatment of “hybrid agreement” faces a number of problems. As a consequence of the proposed revisions, we will see how advances into another area, currently underdeveloped in HPSG, can be given a new impetus, viz. the question of how morphology interacts with the system of syntactic and semantic knowledge. While some inroads into this issue have already been made in P&S (1987), with the object language being English, not many complications had to be dealt with. Here, we argue that the development of the syntax– morphology interface should instead be driven by the analysis of more complex morphological systems, thus providing the foundations, perhaps, for a more comprehensive theory. 1
This paper is based on, but in many parts different from, Kathol and Kasper (1993), and hence supersedes the latter. Thanks to all the people who over the years made comments on that earlier paper, in particular Tomaž Erjavec, Tibor Kiss, Robert Levine, Paola Monachesi, Rosanne Pelletier, Susanne Riehemann, Ivan Sag, as well as audiences at Stanford (1992) and Ohio State University (1993), and in particular Robert Kasper and Carl Pollard. Needless to say, this paper would contain fewer mistakes if I had incorporated all their suggestions.
2
Dowty and Jacobson’s semantic theory of agreement
In line with the general research program of “minimal syntax” (cf. also Dowty 1996) which tries to reduce the amount of information in syntactic descriptions strictly to those parts that are not predictable on semantic or pragmatic grounds, Dowty and Jacobson (1988) attempt to show how agreement too can be viewed as a semantic phenomenon. Whereas this is uncontroversial for instance in the case of agreement that involves “natural gender” for people or regular ways of referring to “plural” vs. “singular” objects, one apparent obstacle to the uniformly semantic view is languages that have “syntactic gender” distinctions, that is classification of all nouns, whether animate or not, into gender classes that do not obviously line up along semantic criteria. An example of such a language is French. In light of such cases, Dowty and Jacobson propose an alternative view of the relation between gender specifications and the things that are referred to in a particular language using expressions bearing those specifications. Instead of indicating that the object referred to directly has the property that is associated with a specific feature (say, being male in the case of masculine gender specification), gender can be regarded as the reflection of the semantic fact holding of an object “in the real world,” namely that it is referenced in a particular language by use of a word bearing a certain feature specification. In other words, “language itself is a part of the world” (Dowty and Jacobson 1998: 98). So, for example it is a fact that holds of chairs that one very common word for this kind of object in French, chaise, is marked as feminine. Markings on other elements in the sentence, e.g. predicative adjective, “respect” this fact, that is, they are only defined “for those objects with the property that the most salient common noun that would be chosen to refer to them in the present context of utterance has” (Dowty and Jacobson 1988: 98) that same agreement specification, i.e. feminine in the case of French belle. So only in (1a) is the subject an admissible argument of the adjective, but not in (1b):2 (1)
a. La chaise est belle. the chair- is beautiful- “The chair is beautiful.” b. *La chaise est beau. the chair- is beautiful-
One of Dowty and Jacobson’s main pieces of evidence for this semantically based view of agreement comes from the gender specifications on deictic pronouns in languages with “syntactic” gender. They point out that in referring to 2
The star is Dowty and Jacobson’s. According to common linguistic conventions, it would indicate that in (1b) (as well as in (2b) below) we have cases of ungrammatical utterances, as opposed to cases of semantic or pragmatic anomalies. It is not clear, however, if this interpretation was intended, and if so, it might not be an uncontroversial classification (although not much will hinge on this question).
an object that is commonly referred to3 by expressions marked as feminine, it is inappropriate to use an arbitrary gender specification even in the absence of an overt linguistic antecedent. Hence, (2b) and similarly (2c) are inappropriate in a context where the speaker is talking about a contextually salient chair: (2)
a. Elle est belle. it- is beautiful- “It is beautiful.” b. *Il est belle. it- is beautiful- c. *Il est beau. it- is beautiful-
Dowty and Jacobson recognize that their approach seems to run into problems when the same objects can be referred to by means of expressions with different agreement markings within the same language. For instance, there are two words for “car” in German: Auto (neut) and Wagen (masc). However, it is not possible to arbitrarily switch the gender specifications on the expressions that are used to refer to an object of that kind: (3)
Ich kaufte ein neues Auto. *Er war teuer. I bought a- new- car-. It- was expensive.
To account for these and similar restrictions on the ways that objects can be referred to in discourse using the same agreement specifications, Dowty and Jacobson appeal to general pragmatic principles to the effect that a change in these specifications can only be interpreted as carrying an “implicature of disjoint reference” (Dowty and Jacobson 1988: 99). This means that the speaker is only justified in departing from the originally used markings to indicate an actually different object. Dowty and Jacobson’s purely semantically based account of agreement has been criticized in various places, in particular by P&S (1994) (cf. also Chierchia 1989). They list five shortcomings of Dowty and Jacobson’s approach (P&S 1994: 105), of which we will concentrate only on a few. First, they note that it is hard to see how purely pragmatic principles can satisfactorily account for the fact that a change of agreement markings is not possible even in cases where syntactic constraints unambiguously determine the antecedent for the agreeing element. A case in point is the interaction of syntax with the gender specification of animals in English. These can commonly be referred to either by the neuter pronoun or by a masculine or feminine pronoun that reflects the animal’s natural gender. Thus, in a situation where there is a male dog, both it and he are admissible pronouns. However, there are examples in which syntax allows but one possible choice of referent to start with. In those 3
By “commonly referred to,” we mean, more specifically, something like “referred to using that base-level word expression which is most appropriate in common utterance situations.”
instances, changing the gender specification would not have any pragmatic effect because the syntactic constraints do not admit any alternative antecedents. Therefore, on Dowty and Jacobson’s account, we should assume that a change in gender is in fact possible. This situation arises, for instance, with subject EQUI control verbs like try. However, as the examples in (4) show (P&S’s (33a–d)), the reflexive nevertheless has to agree in gender with the controller of the VP, that is the matrix subject, something that Dowty and Jacobson’s account cannot adequately capture: (4)
a. That dog is so ferocious, it even tried to bite itself.
b. That dog is so ferocious, he even tried to bite himself.
c. *That dog is so ferocious, it even tried to bite himself.
d. *That dog is so ferocious, he even tried to bite itself.
Another area in which, as P&S note, a uniform semantic approach has to fail is where one and the same linguistic token seems to be involved in two different agreement relations at the same time. For instance, in French, the verb agrees with the polite plural pronoun vous in number (namely plural), whereas the predicative adjective will show singular agreement if the addressee is singular. Furthermore, the predicative adjective also records gender distinctions. Cf. (5), uttered toward a female addressee:4 (5)
a. Vous êtes belle. you are- beautiful-. “You are beautiful.” b. *Vous êtes belles. you are- beautiful-. c. *Vous es belle. you are- beautiful-.
The only way out would be to say that the addressed person in that particular situation has the semantic property of being predicable by adjectives that show feminine agreement and this person also, at the same time, has the property of counting as a plural object for the purposes of subject–verb agreement. This would mean that not only is it a property of objects in the real world to be referenced by linguistic expressions bearing agreement 4
As Bob Kasper has pointed out to me, the force of this example may be weakened by the fact that both the singular and the plural feminine forms of all adjectives are pronounced exactly alike in Modern French so that the only place where there is a difference is in the orthography. A better example might therefore be one such as the following involving the adjective loyal (“loyal”), where the masculine singular and plural forms are phonologically distinct. Accordingly the intended referent is male: (i)
Vous êtes loyal/*loyaux. you are- loyal-./. “You are loyal.”
In the subsequent discussion, however, we will keep with P&S’s original example.
markings of some kind, but it would also be a property of these objects to behave differently according to which linguistic structure contains an expression that relates to (for instance, predicates over) them. This view, however, no longer seems to have any connection to our intuitive understanding of the notion of “semantic property.” It looks as though the only reason we attribute this putatively semantic (not grammatical) property to an object is because of the need to get the grammatical facts right. But once we open this door, there seems little in principle to stop us from conceiving of other grammatical categories, such as case or declension class, in terms of semantics too. But if we claim that, for example, Latin populus (“people”) possesses the semantic property of referring to the kind of object which in Latin is denoted by a linguistic expression with certain paradigmatic properties that are commonly described in terms of “first declension,” not only would this appear to be completely counterintuitive, more importantly, it does not tell us why information such as declension class, unlike grammatical gender (e.g. the fact that populus is masculine), in principle cannot be made reference to for example in terms of the marking on an adjective in the same noun phrase.5
3
Pollard and Sag’s theory of agreement
In their own theory of agreement, Pollard and Sag (1994) propose instead a more differentiated approach to agreement which permits agreement relations among linguistic objects to hold at various levels simultaneously. The central notion in P&S’s treatment of agreement is that of index. Indices have an existence in two domains; as part of the value of the feature, they are part of the semantic contribution of nouns and have an important role, for instance, in determining the particular argument slots that noun phrases fill in the semantic representation of a clause. In this respect they can be thought of as the HPSG equivalents of discourse referents. On the other hand, in the same way that discourse representations can be viewed as extended syntactic representations that form the input for semantic interpretation, indices also contain information relevant for what is usually thought of as syntactic constraints, in particular for describing binding and agreement phenomena. As inhabitants of the semantic realm, there is a close connection between indices and “the real world”: they have to be anchored to actual objects in the context of utterance (or bound by quantifiers). The contraints on what kinds of objects an index with particular feature specifications can be anchored to is given by anchoring conditions. These are, in a sense, pragmatic wellformedness conditions on the proper usage of indices with particular features. 5
However, sometimes purely morphological features are available for agreement phenomena. As Arnold Zwicky has pointed out to me (p.c.; cf. also Zwicky 1987), anaphor–antecedent agreement in the Kru languages Vata and Krahn is sensitive to noun class distinctions that are phonological in origin and thus completely arbitrary from a semantic point of view.
For instance, P&S assume that a pronoun like I, whose index features are [ 1st, sg], is anchored to the speaker of the current utterance situation (cf. P&S 1994: 78):6 (6)
[AVM (6), schematically: a sign with PHON 〈I〉; HEAD noun, CASE nom; an empty SUBCAT list; a CONTENT of sort ppro whose referential INDEX 1 is specified as 1st person singular; and a CONTEXT specification identifying 1 with the speaker of the utterance.]
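To make the division of labor concrete, the following rough sketch (my own illustration in Python, not P&S's attribute-value notation) models an index bearing person and number features together with the anchoring condition associated with I; the names Index and anchoring_ok_for_I are invented for this illustration.

# A rough illustration (mine, not P&S's notation) of what (6) encodes: an index
# with agreement features, plus an anchoring condition restricting what the
# index may be anchored to in the context of utterance.

from dataclasses import dataclass

@dataclass
class Index:
    person: str           # "1st", "2nd", "3rd"
    number: str           # "sg", "pl"
    gender: str = "none"

def anchoring_ok_for_I(index, context):
    """The index of 'I' ([1st, sg]) must be anchored to the utterance's speaker."""
    return (index.person == "1st"
            and index.number == "sg"
            and context["anchor"] == context["speaker"])

ctx = {"speaker": "the person uttering the sentence",
       "anchor": "the person uttering the sentence"}
assert anchoring_ok_for_I(Index("1st", "sg"), ctx)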
Another way of looking at the function of indices is as elements that keep track of the particular way that something or someone gets individuated as a linguistic entity; thus, the features on indices are a record of what particular anchoring condition licenses the association of extralinguistic properties of a referent with the grammatical features in question. As we will see shortly, this record is syntactically important. To fully appreciate their conception of agreement, it is necessary to point out that P&S make a crucial distinction between those cooccurrence patterns in which a head with a particular morphological form selects an argument and those in which certain parts of two or more feature structures are actually structure-shared. In instances of the first kind, the morphological form of the selecting category is determined exclusively in terms of properties of its argument(s), while the selecting category itself does not contain an explicit representation of the respective feature markings. Consider for instance P&S’s feature specification for the pres, 3rd, sg form of the verb walk (1994: 82): (7)
[AVM (7), schematically: PHON 〈walks〉; HEAD verb, VFORM fin; a SUBCAT list containing a single NP whose index is 1; CONTENT a walking relation whose argument is 1, with 1 specified as 3rd person singular.]
6
Unless indicated otherwise, we will assume that valence information is expressed by means of a single attribute called SUBCAT, rather than being distributed across SUBJ and COMPS as in more recent versions of HPSG. Nothing hinges on that distinction.
In this example, the verb agrees with its subject by way of the constraint that the selected subject be one whose specifications are [ 3rd, sg] – the verb itself does not record any of these properties in its own information. In P&S’s own words: “to be a third-singular verb is nothing more than to assign third-singular agreement on the index associated with one’s subject” (1994: 84). The way that this particular selectional property is correlated with the 3rd, sg morphology is by means of a lexical rule that maps base forms into fully inflected finite forms whose morphology is determined by a function taking the stem and idiosyncratic information as its arguments (cf. P&S 1987: 213):7 (8)
[Third-Singular Lexical Rule (8), schematically: a verb of sort base, with PHON 1 and idiosyncratic information 2, is mapped to a verb of sort 3rdsng whose PHON is f3RDSNG(1, 2); the remaining information (3, 4) is carried over unchanged.]
It is then by means of the sort hierarchy that verbal signs of sort 3rdsng will inherit the constraint that the index on their subject be specified as 3rd, sg, thus establishing the association between morphological and selectional properties: (9)
[Constraint (9) on the sort 3rdsng, schematically: such a verb is [VFORM fin] and its SUBCAT list contains a subject NP whose index is specified as 3rd person singular.]
Technically, there is little difference in P&S’s theory between the case where a verb selects a subject with a particular case marking (like ) and the one in which that verb selects a 3rd, sg subject – because there is no structure sharing in either case, we can regard the distribution of the features in question essentially as in terms of government (cf. for instance Zwicky (1986b) on the notion of feature government). Moreover, since the triggering factor, i.e., the feature specification on the index, is intimately associated with semantic properties of the subject in question, it does not seem too far-fetched to understand P&S’s position as treating agreement as a form of “selectional restriction.” This is particularly so as other, more common examples of selectional restriction are also modeled via index properties, such as the requirement by verbs like rain to cooccur with subjects that have an index of sort it.8 7
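The contrast just drawn can be summarized in a small sketch, again my own illustration rather than P&S's formalism: on the selectional view a third-singular verb merely checks its subject's index features without recording them anywhere on itself, whereas genuine structure sharing would make the two categories carry one and the same object.

# Index agreement as selectional restriction (government-like): the verb only
# constrains the index of the subject it selects; nothing on the verb itself
# records 3rd or sg.
def acceptable_subject_for_walks(subject_index):
    return subject_index["per"] == "3rd" and subject_index["num"] == "sg"

kim = {"per": "3rd", "num": "sg"}
we  = {"per": "1st", "num": "pl"}
assert acceptable_subject_for_walks(kim)
assert not acceptable_subject_for_walks(we)

# Structure sharing (as in case concord) would instead be token identity:
# both categories literally contain the same object.
verbs_agr_slot = kim          # the verb now carries the very same index object
assert verbs_agr_slot is kim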
8
Here, we have adjusted the feature architecture of the original formulation of the lexical rule to reflect the more current assumptions about syntactic and semantic information forming a unit that is grouped under in place of two separate features and as in older versions of the HPSG theory. As Carl Pollard (p.c.) has pointed out to me, however, there still is a difference between index selection and genuine sortal restrictions on selected categories in that the latter case crucially involves constraints on the value of the attribute, rather than itself.
230
Agreement in terms of selectional restrictions involving some expression’s index specification (which P&S refer to as “index agreement”) is not the only way that an agreement relation can manifest itself. To see this, consider again our example of before in (5) involving what P&S call “hybrid agreement.” According to P&S’s account, in such sentences, only the agreement of the verb with the plural subject vous is mediated via index agreement. The predicate belle, being marked singular on the other hand, cannot be in an agreement relation with the subject’s index as the latter is specified as plural. Instead, P&S assume that the marking on the predicate reflects a constraint to the effect that the index of the subject – regardless of its own features – be anchored to a nonaggregate entity, thus directly reflecting a nonlinguistic property of the subject’s referent.9 The interaction of index agreement and, as the authors call it, “pragmatic agreement” or, as we will refer to it here, “anchoring condition agreement,” is schematically illustrated in (10): (10)
index agreement () index agreement (, )
〈vous〉 fem 2nd pl . . nonaggregate
êtes
belle
pragmatic agreement () The third type of agreement P&S recognize does involve full-fledged structure sharing between features on “selector” and selected category10 and is exhibited by phenomena such as NP-internal agreement in case and – as we will see later – declension class in German. This type of agreement is what P&S refer to as “concord” or, a little vaguely perhaps, “syntactic agreement.” An illustration is given in (11) (P&S 1994: 92):
9
Agreement of the predicate with the subject in terms of gender is still mediated via the subject’s index. Thus, the plural form belles requires an aggregate of objects referred to by a noun with a feminine-marked index, and not, as P&S state, an “aggregate (of females)” (1994: 103), which would incorrectly rule out sentences like the following with a nonanimate subject: (i) Les chaises sont belles. the chairs-. are- beautiful-.
10
By “selector category,” we mean categories that select other categories, via valence features such as , , or (in varieties such as the one adopted here that assume a single valence list) as well as other selection features like , , etc.
(11)
231
〈mädchen〉
| 2¬ gen 〈 [ 2]1〉 3rd | 1 neut sg girl 1
Here, the noun’s value is represented as part of its specification and, moreover, it is token-identical with the case value of the determiner selected by the noun via the feature. The way the case specification on a noun interacts with that of an attributive adjective is best illustrated by the relevant parts of the entry for the adjective kluge (“smart”) as given in (12) (P&S 1994: 93): (12)
adjective noun nom ∨ acc 〈[strong]〉 | | neut sg 〈 〉
The reason then, according to P&S, why, in a phrase like das kluge Mädchen, the adjective occurs in its weak form while the determiner is of type strong, is that the adjective selects via the feature a noun which in turn via its feature selects a strong determiner.
3.1
Problems with Pollard and Sag’s account
There are a number of conceptual difficulties with this account. First of all, note that with this conception of agreement, except in instances of case concord (or pronominal agreement, for that matter), there is no sense in which a selector category can be specified for particular features other than in terms of the ones it requires to be on the index of the specific argument (e.g. the subject) that it is taken to be in agreement with. This goes against the commonly held assumption (cf. Barlow and Ferguson 1988) that agreement involves some sort of marking on both the argument (usually the “source”) and the selector (usually the “target”), which distinguishes agreement from government. It has been argued (for instance in Zwicky 1986a, 1986b) that the distinction between feature government and agreement should be maintained.
Yet, as we already pointed out, except for case concord, P&S treat agreement essentially as government. Next, this account cannot explain why there often is a close morphological relationship between the form of the selector category and the category selected. For instance, in Latin adjectives and nouns have almost the same range of declension classes so that one can easily find examples such as the following where the morphological markings on both noun and adjective coincide (cf. Lehmann 1988): (13) illarum duarum bonarum feminarum those-.. two-.. good-.. women-.. “of those two good women” If only case concord involves strict structure sharing of information, while the distribution of gender and number marking is done via selectional restriction as P&S suggest, the exact correspondence in form between selecting (adjective) and selected category (noun) would be somewhat surprising. The same holds for cases of subject–verb agreement where both subject and verb exhibit the same morphological expression. While less common than the NP-internal case, examples can nevertheless be found fairly easily, such as agreement with the nonanimate nouns of class 7/8 in Swahili (Corbett 1991: 43): (14) a. Kikapu kikubwa kimoja kilianguka. basket large one fell “One large basket fell.” b. Vikapu vikubwa vitatu vilianguka. baskets large three fell “Three large baskets fell.” Both the noun and the verb show prefixation of ki- and vi- for singular and plural forms, respectively. Again, if the verb’s own morphological markings are only indirectly linked to the morphosyntactic properties of the subject as in P&S’s system, this convergence of form is unexpected. Furthermore, there is no single place in the grammar to express the fact that for languages like English or German, the only features that are involved in subject–verb agreement are and , or for that matter, that is not a feature which ever plays a role in NP-internal agreement.11 On the other hand, , although one of the index features, never plays a role there, while in Russian, verbal agreement with the subject is indeed sensitive to gender distinctions, albeit only for the past tense forms. It seems desirable to be able to make a general statement about what features are available for what kinds of agreement relationships. A similar point is made by Zwicky (1987: 8–9), who points out that one of the differences between what he calls 11
As far as we are aware, this point was first raised by Tibor Kiss.
“anaphor agreement” vs. “local agreement” may be that the former can make reference to an essentially open-ended class of mostly semantically based properties, whereas the set of categories known to participate in local agreement is comparatively limited. Finally, it is not quite clear how to extend P&S’s system to those cases in which there is no (overt) source for the marking on the verb. Impersonal passives in German are an example, showing invariably third person singular marking: (15) An jenem Abend wurde viel gelacht. during that evening was-3. much laughed “There was much laughter that evening.” Short of assuming that there is indeed a phonologically unexpressed, but fully informative empty subject element involved here, for which no theoryexternal evidence can be found, one is faced with the problem that there is no direct relation between the morphology of the verb and any argument index, contrary to P&S’s conception that verbal morphology always exhibits something about the subject’s index. If, on the other hand, one concedes that the verb bears agreement features in its own right here, the question arises whether it should not be possible to say the same of all other cases of verbs bearing inflectional markings too.12 So far, we have only encountered what could be described as conceptual problems with the P&S theory of agreement. However, there are also technical difficulties which we want to focus on now. First, consider the following example from Spanish of what Corbett (1991: 225) calls “hybrid nouns”: (16) Su Majestad suprema está contento. his majesty surpreme- is happy- “Your Majesty is happy.” Here, a male is referred to by a feminine form majestad which results in the attributive adjective bearing feminine marking. The predicative adjective, on the other hand, is marked masculine. Now, according to the P&S theory of agreement, we can account for this mismatch by assuming that the attributive adjective agrees with the noun in terms of index agreement whereas anchoring conditions constrain the gender marking on the predicative adjective to the effect that the index must be anchored to a male individual. The problem with 12
Footnote 12: Balari (1992: 48) also makes the observation that the following two sentences are "utterly ungrammatical," whereas on P&S's account they would merely be treated in terms of a pragmatic anomaly, which, according to Balari, does not seem to correctly reflect the severity and nature of the deviance.
(i) a. *Tu es belles.
       you-SG are-2.SG beautiful-FEM.PL
    b. *Elles sont belle.
       they-FEM.PL are-3.PL beautiful-FEM.SG
this view, however, is that it forces us to assume that anchoring conditions of this kind not only affect animate nouns, but inanimate nouns as well, as the latter also trigger agreement on the predicative adjective, as in:

(17) El coche es rojo.
     the car-MASC is red-MASC

To accommodate this case, short of having a disjunction of different kinds of constraints, we would then have to say something that comes perilously close to Dowty and Jacobson's view, namely that for predicative adjectives in -o, the index must be anchored to something in the world for which the common way of reference in Spanish is by means of a noun with masculine gender specifications. But this is precisely the kind of artificiality that P&S try to avoid by adopting indices as a means of mediation between linguistic expression and the real world.

Another challenge to P&S's view of agreement comes from data on agreement in Italian si-constructions, discussed in Monachesi (1993). Monachesi observes that in examples like the following, the verb shows singular morphology, while the predicative adjective as well as the reflexive se stessi exhibit plural marking.

(18) Si è orgogliosi di se stessi.
     SI is-3.SG proud-MASC.PL of one selves-PL
     "One is proud of oneself."

If we try to apply P&S's distinction between index and anchoring-condition agreement to cases like these, we run into difficulties. To see this, note that according to P&S's account, we would have to say that the verb agrees with the subject via index agreement while anchoring conditions account for the marking on the predicative adjective. But then – contrary to the HPSG binding theory (cf. Pollard and Sag 1992, 1994) – the plural marking on the reflexive stessi cannot come about via structure sharing with the index of a less oblique argument of the predicate orgogliosi because, as the form of the copula è has already shown, the NUM value on that index has to be singular.
4 An alternative proposal
In this section we want to develop an alternative to P&S’s theory of agreement that builds on their main insights but at the same time attempts to address the problems pointed out earlier in a more satisfactory way. We start out by proposing a revision to P&S’s treatment of agreement so that selector categories contain all the information that covaries with that specified on the selected category. In effect, this proposal treats agreement more explicitly as a phenomenon that involves merging of information contributed by various sources in the sentence than P&S’s treatment does. Schematically the difference between the two approaches can be depicted
as follows, where "/" is to be read as indicating the valence on the selector category, similar in spirit to the function–argument structure known from categorial grammar:

(19) a.                 x
          x/y[F: [1]]       y[F: [1]]

     b.              x[F: [1]]
          x[F: [1]]/y[F: [2]]       y[F: [2]]

Here, "F" is to be taken as any agreement feature.13 In (19a), which reflects P&S's theory, agreement can be viewed as a compatibility check between what the selector requires the value on its argument to be and the actual instantiation on the argument. The object that results from this unification is not recorded on the mother category; but as part of the index, this information is kept in the semantic representation. In (19b), on the other hand, the selector category records both its own morphology and that of the selected category, as [1] and [2]. In typical instances of agreement – and the ones we will be primarily concerned with here – the two tags will be identical, resulting in explicit structure sharing within the selector category. Furthermore, since agreement information is encoded as part of the HEAD specification, the value of the combination is also recorded on the mother category. Although this conception might at first blush appear somewhat redundant, it eventually does allow a more systematic and more flexible account of agreement in general. For one thing, it allows us to define abstract types with respect to agreement features. This means that we can make explicit what kinds of features are involved for what categories. As a consequence, we will be able to state the interface between syntax and morphology – i.e., what elements a particular item is in agreement with and how this is reflected morphologically – in a very general and uniform fashion. Second, we can state general patterns of covariation of agreement features. Normally this covariation will involve structure sharing of the same kinds of features, although their "source" might differ (morphosyntax vs. index). Thus, we can make general statements of the form "in English, subject and verb generally agree in number and person" despite the fact that this agreement might not always be reflected morphologically, as, for instance, in the case of modals.14 Such statements are
Footnote 13: Actually, in the case of (19a), it can also be a purely semantic condition on the object that the index is anchored to, according to P&S's anchoring-condition-based agreement.
Footnote 14: Of course, it is also possible to say that modals embody a separate sort of agreement altogether. The distinction between treating the morphological expression of the agreement feature as neutralized and assuming an entirely different agreement pattern is not always clear. Here, we want to opt for the first option. The same point will come up again below in connection with declension-class features in German.
not possible in P&S’s system. Moreover, P&S have to make reference to a heterogeneous set of constraints on lexical items: those involving syntactic feature values like masc and semantic anchoring conditions involving, e.g., maleness. As will become clear shortly, on the conception developed here, the treatment of (local) agreement can rely on feature specifications of the same kind (albeit given at different places in the feature structure). Technically, we are going to cluster all the information that is involved in agreement patterns under a feature called , which may be seen as parallel to the homophonous node in recent GB analyses for essentially similar kinds of information.15 For a verb like walks, we will then have the following, preliminary modified lexical entry: (20) morph-complex PF (4,5) = 〈walks〉 stem 4〈walk-〉 verb 7 | 6 finite | 5 1 3rd 2 sg 7 〈[]3〉 | walking 1 6 3 2 A number of comments are in order about this entry. First of all, note that the entry makes explicit that only person and number are involved in subject– verb agreement in English. This is so because only the features and are features appropriate for the subsort finite of .16 The value of these 15
16
In this study, we will not be concerned about object agreement, but it should be clear that just as such phenomena are often treated in terms of a separate functional head cum projections (“AgrO”), can also be modified to accommodate more complicated agreement patterns. The description in (20) makes explicit the assumption mentioned earlier, that agreement information is part of the specification. As Ivan Sag (p.c.) points out, this entails via the Head Feature Principle that whole sentences will also carry agreement information such as 3rd sg or 1st pl, which may lead to unwelcome consequences for instance regarding the analysis of coordination – given the assumption that head information must match across conjuncts; see P&S (1994: 202–204) for some discussion on this point. Alternatively, one may want to pursue the idea that agreement information is not part of verbal s after all and does not percolate to higher projections. Yet, it is not quite clear whether the same will carry over to NP-internal agreement, discussed below, as it is a widespread assumption that the NP as a whole, not just its head, bears agreement information, which in turn is accessible to selecting verbs.
features is structure-shared with the value of these same features on the index of the verb’s subject. Furthermore, we are going to assume that is part of a larger collection of linguistic information, given here as the value of . This attribute clusters all the morphosyntactic information relevant for morphological spell-out, which includes, but is not restricted to, agreement information. For instance, distinctions in tense obviously bear on the morphological form of a verb, yet they never participate in agreement relations with other syntactic items. The phonology of the lexical item is then obtained by means of a function that takes information as one of its input parameters. More specifically, we take this function to be an instance of realizational frameworks such as Stump’s “Paradigm Function Morphology” (cf., for instance, Stump 1991, 1992). There, morphosyntactic features are realized in terms of paradigm functions which in turn are broken down into “morpholexical functions.” Note that there is an immediate correspondence between Stump’s schema for paradigm functions, stated in (21), and the way morphology is realized in (20) above. (21) (cf. Stump 1991) PF[σ] (x) = y, [σ] = the complete and fully specified matrix of morphosyntactic features associated with y x = the root of the paradigm y = a member of x’s paradigm Thus, what serves to parameterize paradigm functions in Stump’s theory, namely a set of morphosyntactic feature value pairs σ, corresponds to one of the arguments of the paradigm function PF,17 while the features involved, modulo their internal organization, are identical. The other argument of the PF function, Stump’s “root of the paradigm x,” is structure-shared with the value embedded in a new feature, , representing, in effect, morphological structure in terms of the steps in the (possibly recursive) formation process of particular form–meaning pairs from more basic form–meaning pairs.18 In the case of verbal inflection, the value of this feature contains all the relevant morphological information, i.e. the form of the stem that this particular form is related to as part of a paradigm,
17
18
Note that the relationship between the feature specification in , the stem information, and the final output of the morphological process could also be viewed in terms of a relational, rather than functional constraint. For the purposes of this study, however, the difference between the two will not be of significance and our preference for the functional view is mainly due to expository ease. Similar proposals for representing morphological structure, yet each with somewhat different overall feature architectures in mind, have been made in Krieger and Nerbonne (1992), Krieger (1994), Riehemann (1993), and Kim (1993).
as well as syntactico-semantic information associated with that stem which has a bearing on the exact feature specification on the fully inflected word in question. As a result, the top-level distinctions among the appropriate features for signs and their subsorts will be as in (22), which incorporates a separation of what in P&S used to be the sort word into morphologically complex words (morph-complex), for which the new MORPH attribute is appropriate, and morphologically simple signs: roots and noninflectable words:

(22)                      sign
          phrase    morph-complex    morph-simple
                                     root    noninflectible
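To make the intended division of labor concrete, the realizational idea behind (20) and (21) can be pictured with a small executable sketch. This is purely illustrative and not part of the formalism: the Python function below, the attribute names vform, per, and num, and the toy English verb paradigm are our own assumptions; the point is only that a paradigm function consumes a stem together with a fully specified bundle of morphosyntactic features and returns a member of that stem's paradigm, with no affixal objects represented anywhere.

def paradigm_function(stem, features):
    """Map a stem and a morphosyntactic feature bundle to a surface form.

    'features' plays the role of Stump's sigma in (21); affixal material is
    never an object of its own, only part of the mapping.
    """
    if features.get("vform") == "finite":
        if features.get("per") == "3rd" and features.get("num") == "sg":
            return stem + "s"        # e.g. walk -> walks
        return stem                  # remaining finite person/number cells
    if features.get("vform") == "prp":
        return stem + "ing"          # e.g. walk -> walking
    return stem                      # base form

# One cell of the paradigm of 'walk', parallel to the entry in (20):
assert paradigm_function("walk",
                         {"vform": "finite", "per": "3rd", "num": "sg"}) == "walks"

Nothing hinges on the details of this particular function; what matters is that the feature bundle it consumes is the only information the syntax–morphology interface needs to make available.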
It might be illustrative to briefly compare the approach to word formation taken here to the one taken in Kim (1993) (cf. also Kim 1994). The scope of that study is verbal derivation in Korean. The main difference to our approach is that Kim assumes that a word contains a full-fledged representation of its morphological structure, similar to that of phrases, i.e. in terms of a attribute, which in turn has as its values and information. The general schema for putting the morphology of stem and affix together is given in (23): (23) word
1 + 2 | 1 | 2 | . . .
Thus, while Kim does not assume that affixes are “first-class” citizens of the lexicon in the sense that they do not represent lexical entries, they nevertheless contribute information to the resulting form in a strictly concatenative way, just as words combine with other words and phrases in the syntax by way of concatenation. In general, affixes not only contribute morphology to the whole form, but – especially in derivational word formation – encode “instructions” about how to embed the information provided by the stem into the resulting form, including changes and further enrichments. Kim’s word formation schema for passives will serve as an example:
(24) . . . 〈1[], 2[], . . . 〉 3 | {i, hi, li, ki} . . . 〈2[], (1[] ), . . . 〉 3 |
|
While Kim’s treatment of morphology can be seen as half-way between an “item-and-arrangement,” “word-syntax” based approach19 and an “item-andprocess” based one, the view espoused here squarely comes down on the side of latter one. In a sense, our approach can be seen as a natural continuation of Kim’s work in that it tries to completely do away with vestiges of ontological and morphological autonomy of affixal material, so the latter only “live” in the morphological expression given to them by the paradigm function and no special record is kept of nonstem information within the morphological information of a word-sign.20 We are going to assume that agreement information is not only recorded as the value for in verbs, but also for all other major categories. In particular, nouns also will have such information. In the next section, we hope to show how this revision not only leads to a conceptually cleaner theory of agreement, but also helps to develop what we want to regard as a more satisfactory treatment of multi-layer agreement phenomena such as the cases of hybrid agreement in Romance languages.
4.1 Applications
We use hybrid agreement in French as our first example to contrast our treatment with that of P&S. Instead of saying that in a sentence like (5a), repeated below, it is the index of the subject that accounts for the plural marking on the verb, we can instead say that for subject–verb agreement in French, the number feature is morphosyntactic, hence given by AGR.

(5a) Vous êtes belle.
     you are-2.PL beautiful-FEM.SG
     "You are beautiful."

The predicative adjective, on the other hand, does agree with the agreement features on the index, which in French directly reflect the anchoring condition for the index, i.e. that if a linguistic expression refers to a female individual, the index contributed by that expression will in general be marked in its GEND feature as fem.
19 20
For clear examples in HPSG-based literature, cf. Krieger and Nerbonne (1992), Krieger (1994). Cf. Erjavec (1995) for a more detailed elaboration of realizational morphology in HPSG.
As a result, we get a somewhat different multilayer agreement pattern than in P&S’s theory illustrated in (10), in that the division of labor in the treatment of hybrid agreement has been shifted to different components in the theory. (25)
     ⟨vous⟩:  AGR [NUM pl], INDEX [PER 2nd, NUM sg, GEND fem]
     ⟨êtes⟩:  [PER 2nd, NUM pl]
     ⟨belle⟩: [NUM sg, GEND fem]

     vous – êtes:  morphosyntactic agreement (NUM), index agreement (PER)
     vous – belle: index agreement (NUM, GEND)

One caveat to note, however, is that agreement of the finite verb with the subject in person continues to be handled via reference to the subject's index specification. In fact, we assume that noun phrases in general do not have a PER attribute as part of their AGR specification. This correctly removes PERSON as a candidate dimension of variation for NP-internal agreement phenomena. This in turn also means, for instance, that the pronoun expressing the polite plural in German, Sie, will have a [PER 3rd] specification on the index.

The treatment of French agreement mismatches in (25) carries over rather directly to the Spanish example in (16), repeated below, which we showed to be problematic for P&S's theory.

(16) Su Majestad suprema está contento.
     his majesty supreme-FEM is happy-MASC
     "Your Majesty is happy."

All we need to say is that NP-internally, agreement with the head noun (e.g. as exhibited by attributive adjectives) is captured in terms of structure sharing with the noun's AGR specification – which in the case of majestad has fem as the value for GEND – while the predicative adjective agrees in terms of the information recorded in the noun's INDEX value, i.e. masc. Note that the latter is the morphosyntactic reflection (on the index) of an anchoring condition, hence is only indirectly associated with the semantic property of being male. As a result, all predicative adjectives can be given a uniform and straightforward treatment in terms of their agreement properties.

Finally, the Italian data no longer present a problem either if, again, we make the assumption that subject–verb agreement is a reflection of the subject's morphosyntactic specifications encoded under AGR, while the predicate agrees with the corresponding INDEX properties. In section 5.1.2, we will propose an alternative analysis of the Italian case that treats sentences such as (18) as true impersonal constructions, i.e. without an overt syntactic entity that the verb could agree with; hence the inflection is taken to be due to a general agreement pattern that applies when no subject is present.
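The division of labor illustrated in (25) can also be pictured with a small executable sketch. This is only an illustration, not the feature geometry itself: the dictionary encoding and the attribute names AGR, INDEX, NUM, PER, and GEND below are assumptions made for expository purposes, and checking plain value identity stands in for the structure sharing that the analysis actually assumes.

# Toy rendering of the French hybrid-agreement case in (5a)/(25).
vous = {"AGR":   {"NUM": "pl"},
        "INDEX": {"PER": "2nd", "NUM": "sg", "GEND": "fem"}}

def verb_agrees(subject, verb_feats):
    # Finite verb: NUM is matched against the subject's AGR (morphosyntax),
    # PER against the subject's INDEX.
    return (verb_feats["NUM"] == subject["AGR"]["NUM"]
            and verb_feats["PER"] == subject["INDEX"]["PER"])

def pred_adj_agrees(subject, adj_feats):
    # Predicative adjective: NUM and GEND are matched against the INDEX alone.
    return (adj_feats["NUM"] == subject["INDEX"]["NUM"]
            and adj_feats["GEND"] == subject["INDEX"]["GEND"])

assert verb_agrees(vous, {"PER": "2nd", "NUM": "pl"})           # êtes
assert pred_adj_agrees(vous, {"NUM": "sg", "GEND": "fem"})      # belle
assert not pred_adj_agrees(vous, {"NUM": "pl", "GEND": "fem"})  # excludes belles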
In general, we can schematically represent the two forms of agreement we want to distinguish as follows:21 (26) a. morphosyntactic: (selector) ≈ (arg) b. semantic: (selector) ≈ (arg) This approach naturally lends itself to the treatment of other kinds of mismatches, for which P&S would have to invoke the difference between indices and anchoring conditions, as before. A particularly interesting example is the following sentence from Russian (from Corbett 1988): (27) Novyj vrah skazala. new- doctor said- “The new (female) doctor spoke.” The noun vrai is morphosyntactically masculine which in the sentence manifests itself by masculine marking on the attributive adjective despite the fact that the noun has a female referent. This is an example of what Corbett (1991) calls “agreement ad formam” (or: “syntactic agreement”). However, the agreement on the verb, which inflects for gender in the past tense, shows feminine marking, i.e. exhibits, in Corbett’s terms, “agreement ad sensum” (also: “semantic agreement”) with the natural gender of the subject. This situation is easily accommodated into our framework if we regard the morphology of the attibutive adjective as a manifestation of agreement, while the verb spells out some of the subject’s features. However, as Corbett observes, judgments are not always entirely clear-cut. Thus, while a majority of speakers accept (28a), there are some speakers who seem to prefer the feminine marking on the attributive adjective in (28b) (cf. Corbett 1991: 231 and 1983: 30–39): (28) a. Ivanova – xoroj-ij vrah. Ivanova good- doctor b. Ivanova – xoroj-aja vrah. Ivanova good- doctor “Ivanova is a good doctor.” What this dialectal variation appears to indicate given the assumptions so far is that for some speakers, attributive adjectives too can (or, as it seems, must) agree with the head noun in terms of information, while in the majority dialect, the information relevant is that given by the noun’s specification. Corbett notes that agreement on the verb in examples such as (29) is also subject to variation in these cases, but here, the split between agreement ad sensum and agreement ad formam is more even. 21
Here, “≈” stands for something like “is structure-shared in its relevant parts with.” As we will show later, the entire values of or are generally not structure-shared because might contain additional information not relevant for the particular agreement phenomenon at hand.
(29) a. Vrah prijel-ø. doctor came- b. Vrah prijl-a. doctor came- “The doctor came.” In order to capture the general tendency that linguistic expressions in closer “proximity” to nouns will be more likely to show agreement ad formam whereas those more distant tend to reflect semantic properties of the agreement target, Corbett proposes the following “agreement hierarchy” (cf. Corbett 1991, 1983): (30) attributive < predicate < relative pronoun < personal pronoun Corbett further notes that the following constraint on agreement systems holds: (31) As we move rightwards along the hierarchy, the likelihood of semantic agreement will increase monotonically (that is, with no intervening decrease). However, it seems clear from our discussion of the motivation behind P&S’s use of indices for keeping a syntactically significant record of modes of individuation, more fine-grained distinctions in the notion of “semantic agreement” are needed, at least for English. For instance, as P&S observe, while ships can be referred to by third person singular feminine pronouns, whatever choice is made must be adhered to for the choice of reflexive pronoun, exactly parallel to the dog example in (4) above: (32) *The boat lurched, and then she righted itself. On the other hand, when it comes to relative pronouns, there is not even a choice to begin with. Here, the only criterion involved is whether or not the referent is human, which is an entirely nonlinguistic distinction, which, as the authors point out, never needs to appeal to index information of the expression involved (cf. P&S 1994: 82): (33) the boat *who/which I like That in English the mode of individuation registered on the index has relevance beyond the domain of anaphor binding can be seen by the following example, which is a variation on a sentence by P&S:22 22
Note, however, that as P&S’s original example shows, a switch of pronoun across clear sentential boundaries is possible: (i) That dog is so stupid, every time I see it I want to kick it. He’s a damn good hunter, though.
(34) That dog is so stupid . . . a. every time I see it I want to kick it. b. every time I see him I want to kick him. c. *every time I see it I want to kick him. d. *every time I see him I want to kick it. The fact that it is unacceptable to switch from gender-neutral to genderspecific reference to a dog and vice versa can be seen as direct counterevidence against Corbett’s monotonicity claim about his agreement hierarchy. As the last example shows, personal pronouns are obviously more sensitive to facts that are not strictly semantic, i.e., the specific choice in the way an object is individuated, than relative pronouns are. A similar case can be made for discrepancies between natural and grammatical gender in German, which we will demonstrate with an example sentence from (1991: 228). As is well known, the neuter noun Mädchen (“girl”) can serve as an antecedent to either a feminine or neuter pronoun. (35) Schau dir dieses Mädchen an wie gut sie/es Tennis look you this- girl at how well she-/it- tennis spielt. plays “Do look at this girl, see how well she plays tennis.” Thus, the reason why it is possible to, as it were, “switch” in gender specification between the noun and the pronoun is because there is no strict structure sharing between the index information contributed by either one. Instead, the feminine pronoun sie is licensed by virtue of the fact that the referent is female. This is precisely the behavior we would expect if the choice of pronoun is guided only by strictly semantic criteria. However, the fact that the neuter pronoun is also a valid choice in the example at hand can only be a reflection of some morphosyntactic property of the linguistic antecedent in that sentence because there is presumably nothing intrinsically neuter about a young woman that would warrant this choice of pronoun.23 Of course, the morphosyntactic property involved here is the agreement information on the index contributed by the antecedent Mädchen. So, again, agreement marking on a personal pronoun has been shown not to follow strictly from semantic considerations alone.24 Despite these reasons to question the universality of Corbett’s agreement hierarchy and, concomitantly, details of the proposed scale, we believe that 23
24
Rather, the reason for the neuter gender entirely lies with properties of the diminutive suffix -chen, whose normal semantic import is no longer transparent in this case. The German example does not by itself constitute a counterexample to the specific agreement hierarchy Corbett assumes because that would necessitate that we can show that the agreement behavior of some other element on that scale is provably “more semantic” than that of personal pronouns.
his overall insight is correct. We also believe that this intuition is captured more conspicuously by the framework developed here.25 Thus, whenever a selector category makes reference to the value of the category selected, we are dealing with a clear case of agreement ad formam. On the other hand, if an expression with certain features is licensed by anchoring conditions, we have a case of agreement ad sensum. Indices take a middle ground, that is, in the context of an expression that agrees ad formam, they sometimes reflect some “true” semantic property of some expression’s referent, for instance that referent of vous is nonaggregate and thus licenses a singular specification on the corresponding index. In other instances, they keep a record of the mode of individuation for some expression – in which case their purpose to a great extent converges with the one that P&S had in mind for them. The following sentence from Corbett (1991: 226) serves as evidence for the latter case. (36) Sa Sainteté n’est pas ombrageuse de s’en formaliser. his. holiness .is not touchy. of .of.it offend “His Holiness is not so touchy as to take offence.” This sentence can be uttered with a male referent of the noun sainteté. It is clear that the predicative adjective in French does not in general agree with the specifications on the subject; if it did, the form of the adjective in (5) above would have to be belles. Moreover, it is also noncontroversial that, at least as far as gender is concerned, the predicative adjective must not agree in terms of the referent’s actual (natural) gender; for if it did, the form in (36) above would have to be ombrageux. Hence, the only option left is to say that predicative adjectives in French always agree in terms of the index specification and moreover – contrary to P&S – in both number and gender features. This claim is further supported by Corbett’s observation about pronominal agreement in cases of hybrid agreement. Thus, the preferred form (though not the only one, as shown in (37b)) in these cases is elle (cf. Corbett 1991: 277): 25
Unfortunately, at present we do not have a good theory why this should be so, i.e. from what facts it would follow that attributive adjectives should in general be more sensitive to morphosynactic properties of their agreement controllers than, say predicative ones – other than vaguely defined notions such as “proximity.” But note that P&S’s theory does not fare any better in this respect: it is not clear why the agreement behavior of, say, predicative categories should be more likely to be sensitive to anchoring conditions than that of attributive ones, while the latter are likely to make reference to P&S’s encoding of morphosyntactic features, i.e., indices – in fact P&S even make a brief remark about a case of hybrid agreement in Serbo-Croatian, given below, where they do suggest, for the (ib) example, that the attributive possessive can agree with the head noun with female “form” and male referents in terms of anchoring conditions. (cf. P&S 1994: 103) (i) a. naje gazde our-. masters-. b. naji gazde our-. masters-.
(37) a. Votre Majesté partira quand elle voudra. your majesty will.leave when she- wishes “You Majesty will leave when he wishes.” b. Sa Majesté fut inquiète, et de nouveau il envoya his- majesty was worried- and again he- sent La Varenne à son ministre. La Varenne to his minister “His Majesty was worried, and again he sent La Varenne to his minister.” Thus, the situation in French is very much parallel to the German Mädchen example we saw earlier. Morphosyntactic information as encoded on an expression’s index can sometimes manifest itself in perhaps somewhat surprising contexts. One example is predicative nominals ending in -in in German. While, in general, nouns such as Lehrer (“teacher”) or Student (“college student”) are ambiguous between a gender-neutral use26 and one that is specifically used for male referents, the corresponding forms ending in -in such as Lehrerin or Studentin are restricted to female individuals. Interestingly, however, predicate nominals ending in -in (such as Bauherrin or Eigentümerin) can also sometimes occur when the subject clearly has no natural gender, as in the following examples: (38) a. Die Samper-Gesellschaft ist Bauherrin der Frauenkirche. the Samper society- is contractor- of.the Frauenkirche b. Die Musiker-Vereinigung ist Eigentümerin des Konzertsaals. the musician association- is proprietor- of.the concert hall What this indicates is that the predicative nominals ending in -in can make reference to morphosyntactic properties encoded in the index of the subject rather than purely semantic ones; in turn it is clear that an approach to this phenomenon analogous to P&S’s treatment of the French polite plural case could not succeed here for precisely the same reason as the Spanish example in (16). Incidentally, it should be noted that the kind of agreement on the predicate seen in (38) is not obligatory, as seen in (39): (39) a. Die Samper-Gesellschaft ist Bauherr der Frauenkirche. the Samper society- is contractor- of.the Frauenkirche b. Die Musiker-Vereinigung ist Eigentümer des Konzertsaals. the musician association- is proprietor- of.the concert hall 26
Despite recent efforts to create a gender-neutral form StudentIn, the use of the nonfeminine form is still pretty much alive, especially in quantified contexts: (i) Jeder Student hat seine Gebühren bezahlt. each Student has his fees paid “Every student has paid his fees.”
This observation is in line with the fact that nominal predicates with particular gender specifications are in general quite acceptable with subjects of different gender. In those cases, it is only the subject, never the predicate, that contributes the information relevant for pronominal reference. Examples from German and French in (40) illustrate: (40) a. Herr Meyer ist eine furchtbare Person. Herr Meyer is a- terrible- person- *Sie/Er ist sehr arrogant. she-/he- is very arrogant b. Madame Dupont est un excellent médecin. Madame Dupont is an excellent- doctor *Il/Elle est aussi très gentille. he-/she- is also very nice- Let us now look briefly at how subject–verb agreement fits into our theory. In English, this kind of agreement is not based strictly on morphosyntactic information. This means for the account developed here that for English, the agreement marking on on nouns will never be directly accessed, but only the index. Evidence for this view comes from examples of “reference transfer” or “deferred ostension” mentioned by P&S from Nunberg (1977) (cf. also Nunberg 1993). For instance, there are situations in restaurants in which waiters identify customers with their orders: (41) That hashbrowns at table six wants to pay his check. The apparent mismatch between the number specification on hashbrowns and that of determiner, verb, and possessive pronoun is reconciled if one analyzes hashbrowns as contributing an index marked as sg, i.e. as a manifestation of the fact that the referent of this expression is a nonaggregate entity. In this sense, subject–verb agreement in English can be argued, with Dowty and Jacobson, to be essentially semantic in nature. However, there may be reason to believe that subject–verb agreement is not equally semantics-based in all languages. For instance, while there exists a corresponding waiter jargon in German, the analogous shift in agreement to masculine singular anywhere in the sentence seems completely impossible: (42) *Der Bratkartoffeln an Tisch 7 will bezahlen. the-. home fries- at table 7 wants- to.pay This fact can be explained straightforwardly if we assume that in German the verb always agrees with the subject in terms of the latter’s morphosyntactic features, i.e., the information encoded in .27 In that sense then, subject–verb 27
While it seems that the corresponding sentence with plural agreement is significantly better in acceptability, it nevertheless has a ring of marginality to it which might have to do with the fact that reference transfer in German is best with those cases where the number specifications do not change, compare the following two sentences:
agreement in German, similar to our analysis of the French example in (5), is strictly governed by morphology rather than semantics.28 Let us now step back and look at what exactly the role of indices and anchoring conditions is in the account presented here. With P&S we assume that indices are to be treated, in general, as a mere grammatical reflection of the way an object in the discourse is individuated, which itself, of course, is a semantic fact. For the most part then, it would not make much of a difference in the theory if we decided to completely circumvent agreement specification on indices as part of our theory and base agreement solely on anchoring conditions. For example, we could specify that the index associated with the pronoun he in English is to be anchored either to an object whose natural gender is masculine or, as a bound-variable anaphor, can range over a set of individuals of unspecified gender. No mention of any features on the index itself is necessary. However, this will not work in general because there seems to be a constraint in English that the way that some index is anchored must be respected within certain syntactic domains. In (4) above, we already saw an example of a domain in which the particular anchoring condition that was used to refer to the dog must be maintained for the reflexive. A similar phenomenon can be seen with bound variable anaphora. Depending on social awareness, it is possible to have he or they as a bound variable anaphor if the quantification ranges over a set of individuals of unspecified gender: (43) a. Every studenti said that hei thinks that hei should get the prize. b. Every studenti said that theyi think that theyi should get the prize. However, it is not possible to use a different anchoring condition if, logically, the quantification is supposed to involve the same variable: (44) a. *Every studenti said that hei thinks that theyi should get the prize. b. *Every studenti said that theyi think that hei should get the prize. This falls out if the two phenomena are analyzed in terms of structure sharing of indices that bear agreement markings. Let now consider the status of indices vis-à-vis anchoring conditions in P&S’s and our theory. Going back to their analysis of (5) again, such examples (i) a. ?Die Bratkartoffeln an Tisch 7 wollen bezahlen. the- home fries- at table 7 want- to.pay b. Das Jägerschnitzel an Tisch 7 will bezahlen. the-. jagerschnitzel- at table 7 wants- to.pay
28
Obviously, an investigation into the exact nature of the waiter jargon in German (of which the author does not claim to be a competent speaker) is needed. A very interesting mixed case of morphological and semantic constraints is Swahili. Here both verb and adjective agree with the subject and head noun, respectively, according to noun class. However, if this noun denotes an animate object, the agreement marking on adjective and verb will be the same as that of gender class 1/2, which exclusively contains animate nouns. In a sense the semantic condition “animacy” supersedes the morphosyntactic agreement scheme that normally holds.
show that for P&S, a female in French in certain utterance-situations (i.e. where they are addressed by means of the polite plural pronoun vous) can be associated with an index marked as plural. Hence, the index does not reflect the way this person is individuated, namely as “feminine singular,” as the agreement on the predicative adjective shows. In our revised theory on the other hand, if there is an anchoring condition that pertains to that specific utterancesituation, it will be directly reflected on the index. As a consequence, we assume that the entry for the pronoun vous does not itself have any gender and number features specified on the index, but instead only as the value for : (45) and information in French vous: . . .| [ pl ] 2 . . .| A different principle applies for determining the information provided by nouns in French. As the example in (36) shows, this category does provide gender information in the index: (46) and information in French sainteté: . . .| sg fem 3 . . .| sg fem As a result, we can say that in the case of French nouns, it is the morphosyntax of the noun which determines the features on the index, whereas for the polite pronoun, the determining factor is the properties of the referent. This situation is somewhat similar to the case of nouns in Spanish, as seen earlier in (16). Here, it is again the referent itself, rather than its linguistic expression that determines what information is provided by the index; hence in the lexicon, the entry for majestad will look as given in (47): (47) and information in Spanish majestad: . . .| sg fem 3 . . .| sg However, note that this will only apply in the case of personal referents. When the reference is to a nonpersonal object, as in the coche case in (17) above, the information will be determined by the morphosyntax of the noun:
(48) and information in Spanish coche: . . .| sg masc 3 . . .| sg masc Schematically, we can represent the different factors that are involved here as follows: (49) anchoring conditions
index
morphosyntax
From this perspective, the difference among various languages is a function of which of the two determining factors wins out for what kinds of cases. In Spanish, the generalization seems to be that information is determined in terms of what is encoded in unless there is a personal referent, in which case the general constraint that “natural gender/number determines grammatical gender/number” takes precedence. In French, on the other hand, this rule only appears to apply for polite pronominals.29 Moreover, having an additional level of representation for morphosyntactic information (instead of relying mostly on indices for this) makes it possible to make an important generalization about where reference to anchoring conditions is necessary for the description of agreement. We claim that whenever an index encodes semantically based distinctions, categories which are not in themselves “referential” will exploit this information rather than impose constraints of their own on the anchoring of indices. For the cases we looked at in English, French, German, and Spanish, this means that only nouns and pronouns will need to make reference to anchoring conditions. For other categories, on the other hand, that is, for the most part adjectives and verbs, mention of anchoring conditions will not be necessary. Thus, these categories can be said to be dependent either on purely morphosyntactic properties of agreeing expressions (i.e., agreement with ) or whatever conventions the language has for anchoring indices to objects in the real world (i.e., agreement with ). We further argue that this constraint is natural since indices arise primarily in connection with referential expressions; therefore general constraints between features on indices and what kinds of objects such indices are anchored to should arise primarily in connection with such referential expressions themselves rather than, for instance, predicates over those indices. 29
As Carl Pollard has pointed out to me, this does in effect replicate the disjunctiveness of the P&S-based solution, however, it is pushed into the lexicon, and thus will not have to be dealt with in the syntax. Since we have to assume different anchoring conditions for the index/referent relationship anyway (cf. the feminine indices for ships and female referents), no generalization is lost, but instead, we arguably arrive at a more perspicuous division of labor between the two components than in P&S’s system.
Note that this does not mean that, for instance, verbal categories never make reference to the anchoring of indices on their own. One example where this would appear to be necessary is what P&S call “honorific agreement” (1994: 96–101) in Korean. For example, in the following four sentences from Korean, only those in which both verb and subject correspond in the level of honorification are deemed acceptable (cf. P&S 1994: 97): (50) a. Kim sacang-i o-ass-ta. Kim President- come-. “President Kim has come.” b. Kim sacang-nim-i o-si-ass-ta. Kim President-. come-.. o-si-ass-ta. c. #Kim sacang-i Kim President- come-.. o-ass-ta. d. #Kim sacang-nim-i Kim President-. come-. Since P&S assume that honorification in general is to be treated in terms of background conditions on indices, facts about anchoring of indices will have to be mentioned in the entry for the verb. However, note that it is far from clear whether the index itself will encode information pertaining to honorification in the same way that indices in, say English, encode information about person, number, and gender. Therefore, honorification does not represent a counterexample to our proposal. To conclude this section, we have shown that the problems with P&S’s accounts for hybrid agreement can be overcome if their distinction between the status of and on the one hand and (and -) on the other is collapsed. In terms of P&S agreement ontology then, our system does not make a distinction between concord and agreement. Rather, all morphosyntactic agreement – and, in a sense, our account of semantic agreement too – can be viewed as concord. For languages like French, this leaves us with only two explanatory devices to account for the various patterns of feature distribution on nonnominal expressions, while P&S need three: pragmatic agreement, index agreement, and feature concord.
5 Formalization of agreement constraints
One of the distinctive features of the HPSG formalism is that it allows us to state linguistic generalizations in terms of sorts. A way to utilize sorts is to localize particular aspects of linguistic knowledge that cluster together. In this light, we want to propose a treatment of different kinds of agreement patterns as separate sorts. Moreover, these kinds of sorts are not just monolithic pieces of linguistic information, but they bear particular relations to each other. Thus, one sort can be a subsort of another, hence the first will be a
more specific, more informative cluster of information. Two sorts can inherit from the same supersort and hence share certain properties while differing in others. In the case of agreement, we want to distinguish between two domains of knowledge that are both structured in sort hierarchies: • •
what kinds of agreement features are appropriate for a particular category; what kinds of cooccurrence patterns these agreement features exhibit, in our terminology: what kinds of “agreement patterns” there are.
The first will be specified in terms of which features are appropriate for . In this study, we will not be concerned much with this issue (but cf. Kathol and Kasper (1993) for some suggestions). The second kind of information is less local. It relates the information contained in of a selector category with the specification on some selected category. As we will see shortly, there may be reason to assume that the patterns of covariation involved here may actually differ from each other in terms of what part of the feature geometry is involved.
5.1 Agreement patterns
In this section, we will discuss the centerpiece of the morphology–syntax interface proposed here, namely the sorts that specify various distribution patterns of features, the "agreement patterns." We will discuss two subcases which can be distinguished according to the size of the domain in which such patterns apply: NP-internal agreement (we will be chiefly concerned with a description of determiner/adjective–noun agreement in German) and subject–verb agreement in various languages. Both cases, however, are instances of "selector–selectee agreement" (and hence the agreement constraints can be specified lexically on the selector category alone), as opposed to nonlocal agreement such as between an anaphor and its antecedent in discourse (which, as we showed above, tends to be driven more by semantic constraints such as anchoring conditions).

5.1.1 NP-internal agreement in German

Starting with adjectives and determiners, one question that comes immediately to mind is on which level within the information structure of a sign the agreement pattern is to be stated. We suggest that this can best be done locally, on values of the HEAD feature. The reason for this is that the categories selected by adjectives and determiners are values of the HEAD features MOD or SPEC, respectively. Thus, given the feature architecture of P&S (1994), the minimum level containing both the selector's AGR features and the selectee's category specification is within the value of the selector's HEAD feature. This allows us to state a (somewhat speculative) generalization: agreement stated on the HEAD level as a more "local" (in terms of the geometry of the feature structures for
selector categories30) sort of agreement is more likely to be expressed drawing from a common set of inflectional morphemes than "less local" agreement patterns (which, as we will see in the next section, have to be expressed on the level of CAT). For the case of agreement between adjectives and nouns, this seems to be true for languages as diverse as Latin, German, and Swahili. And for German, determiner–noun agreement works that way too. The agreement pattern we want to state for attributive adjectives in German is very regular, involving the agreement subsort nom(inal)-agr(eement):

(51) adj-agr-pat:
     [ AGR              [1] nom-agr ]
     [ MOD | ... | AGR  [1]         ]
agr
One complication with respect to different occurrence pattern of adjectives needs to be addressed. Apart from their attributive use, adjectives can also occur predicatively. The two are semantically distinct. Assuming the treatment in P&S (1994), an intersective attributive adjective contributes an additional restriction on the head noun’s index (non-intersective adjectives are treated in a different way), whereas predicative adjectives instead contribute, like verbs, only a relation that has the index as its argument. On the one hand, we need to get the right kind of distribution of adjectives with and without agreement inflection, 30
31
32
Thus, this sort of “local agreement” is a subtype of the kinds of “local agreement” phenomena Zwicky (1987) wants to distinguish from “anaphoric agreement.” Since Zwicky also wants to treat certain cases of subject–verb agreement as local, the correspondence is not complete. Of course, pronouns differ with respect to their person value, but it seems odd to consider pronouns as realizing person information in the same way that distinctions in person are reflected on different finite verbal forms. This architecture was suggested to me by Ivan Sag (p.c.).
and on the other hand, we still would like to be able to state the form of an adjective independently of its specific occurrence as either of the two possibilities. One possibility would be to relate the two forms by a lexical rule. Another solution would be in terms of a sort inheritance from a common “proto-sort” along the lines proposed in Kathol (1994). Instead, we pursue another line here which promises to yield a more satisfactory general framework for both inflectional and derivational morphology, namely in terms of (here: unary) word formation schemata (cf. Kim 1993 and Riehemann 1993). We already saw one example of (the output of ) such a schema mediating between stem and inflected form in (20) above. The central idea is the same here, except that a single adjectival stem will have two kinds of inflected forms, one for the attributive and one for the predicative variant, which, as we already pointed out, need to be distinguished semantically. The specific information contained in the att-adj-word sort, which can be seen as a word formation schema taking adjectival stems and yielding fully inflected attributive adjectives is given in (53):33 (53)
att-adj-word PF(4,5) stem 4
- 8 〈1〉 | 7 2 - 8 | 7 adj-agr-pat 1 | 3 5 1 {2}ø3
33
From now on, we will assume that valence is expressed in terms of the (syntactically potent) valence features and , while - (cf. Manning and Sag 1995) serves to represent a lexical element’s full argument structure. This list is relevant, for instance, for establishing obliqueness relations. This move becomes important for the problem of providing antecedents for reflexives within complements of an adjective, as in (i): (i) a. Die Professoren sind auf sich stolz. the professorsi are of selfi proud “The professors are proud of themselves.” b. die auf sich stolzen Professoren the of selfi proud professorsi “the professors that are proud of themselves” Note also, that we will assume that, in particular, the feature is not appropriate for attributive adjectives.
What this feature description in effect says is that given the information specified in the lexicon for an adjectival stem, the attributive form will embed that information in the ways specified in terms of the semantics (), syntax (), and morphology ( and ). Since the resulting form has its value sortally specified as adj-agr-pat, it is automatically ensured via the agreement constraint above in (51) that the attributive adjective agrees with the f it modifies. Let us now turn to predicative adjectives. Here, the situation is similar in that the information provided by the stem is “fed” into the top-level specifications. However, the specifics are somewhat different. (54)
pred-adj-word PF(4,5) stem 4 - 8 | 7 2 - 8 7 | [ 5none] 2
First of all, note that predicative adjectives are not in a modifying, but rather in a predicative relationship with the nominal, which syntactically in HPSG is treated in a way similar to that of raising verbs. This means that they will not provide a restriction on an index, as in the attributive case, but instead they contribute a relation. Moreover, in German, predicative adjectives always have the exact same form, regardless of the features of the subject. This is expressed in our system first by the absence of any agreement pattern involving subjects and predicative adjectives, and second by way of a special value of the feature “none” that encodes the lack of agreement controller.34 On the other hand, in French, where predicative adjectives also agree with the subject, the predicative word formation schema would look as follows, where a subsort of cat, viz. pred-agr-pat, ensures structure sharing between the adjective’s features and those of the index of the subject NP:
34
Alternatively, we could have chosen to let the predicative adjective have no feature at all. Though this seems to reflect our intuitions about the facts somewhat more adequately, it would necessitate a more complicated interface between syntax and morphology, as the spell-out would have to check for the presence or absence of agreement information in the head.
(55)
pred-adj-word PF(4,5) stem 4 - 8 | 7 2 pred-agr-pat - 8 |
9 0 9 5| 0 2 7| . . .|
Coming back to the German case, note that the two word formation schemata nevertheless look very much alike, thus it seems desirable to capture the similarities by means of a common sort (cf. (57)) that both inherit from: (56)
adj-word attr-adj-word
(57)
pred-adj-word
adj-word PF(4,5) stem 4 . . .|- 7 . . .|| 8 - 7 . . .| | 8 | 5
The amount of nonpredictable information that will have to be associated with a stem to yield the two forms is also fairly sparse; all the rest is done by the respective word formation schema. Cf. the entry for the stem rot- (“red”) below: (58)
rot 〈rot-〉 |
| 〈 〉 red
Let us now come back to (57) and take a closer look at the way inflectional features and morphological form constrain each other. In other words, what does the paradigm function look like? In Stump’s system, in getting from feature-value pairs to actual morphological forms, a great deal of use is made of defaults and overrides. Since, in general, HPSG has been shying away from such nonmonotonic devices, one conclusion one might draw is that Stump’s system is inappropriate or not available as a theory of morphology for HPSG.35 However, one could also take the opposite view, namely one should try to preserve as much as possible the generalizations Stump is able to express in his more flexible system, even though this might not be possible to do within the framework of HPSG itself.36 In this study, we tend to lean toward the second position, i.e. that one should be prepared to dissociate the morphology–syntax interface from the actual specifics of morphological expression, at least until it is clear how to incorporate Stump’s insights into a feature-based representation of linguistic knowledge such as HPSG.37 On the other hand, note that the syntax–morphology interface per se is completely orthogonal to the question of how to capture regularities of morphological expression, hence at least for the issue of agreement the details of the latter are irrelevant. Thus, for the purposes here, a simpleminded listing of matches between agreement features and morphological forms would suffice. A somewhat more sophisticated example of how to mediate morphological expression which we will assume here for the purposes of exposition involves clustering forms that are alike in the way suggested in Kathol (1994):38
35
36
37
38
Calder (1994) argues that one has to distinguish between two kinds of defaults. First there is the “unconstrained” use, that is one “in which defaults play an essential role in determining the combinatoric possibilities determined by a grammar or where there is no domain in which defaults are final” (p. 7). The other is one in which defaults are used as an abbreviatory device. At least as far as inflectional morphology goes, it seems fairly clear that defaults in the description of the feature-value pairs/form mapping are not of the first type, i.e. the combinatorial possibilities are unaffected by the kinds of defaults that Stump uses, and moreover, since we want to assume here some version of the principle of “morphology-free syntax,” defaults are clearly final in the morphological domain, i.e., no morphological default will ever play a role in the domain of syntax. This is the reason that morphological defaults are nonessential, and thus, strictly speaking, superfluous for stating the morphological spell-out of inflectional features. For instance, Erjavec (1995) proposes an implementation that relies on -style lexical rules (cf. Carpenter (1992) ), which exploits procedural aspects of the implementation resulting in an approach more faithful to Stump’s original model than what could be achieved using the HPSG description language alone). Cf. Bird and Klein’s work on how to bring phonology into the scope of feature-based formalisms. Note that we assume the relationship between features, stem, and form to be functional, rather than relational as in Kathol (1994).
(59)
nom-agr 1 PF 5, 2 3 4 1 = sg & 3 = (gen ∨ dat) & 4 = weak ∨1 = pl & 4 = weak = 5 + -en iff ∨1 = sg & 2 = masc & 3 = acc ∨1 = sg & 2 = (masc ∨ neut) & 3 = gen & 4 = strong ∨1 = pl & 3 = dat & 4 = strong 1 = sg & 2 = fem & 3 = (nom ∨ acc) ∨1 = sg & 2 = masc & 3 = nom & 4 = weak = 5 + -e iff ∨1 = sg & 2 = neut & 3 = (nom ∨ acc) & 4 = weak ∨1 = pl & 3 = (nom ∨ acc) &4 = strong = 5 + -es iff 1 = sg & 2 = neut & 3 = (nom ∨ acc) & 4 = strong 1 = sg & 2 = fem & 3 = (gen ∨ dat) & 4 = strong = 5 + -er iff ∨1 = pl & 3 = gen & 4 = strong ∨1 = sg & 2 = masc & 3 = nom & 4 = strong = 5 + -em iff 1 = sg & 2 = (masc ∨ neut) & 3 = dat & 4 = strong PF(5, none) = 5
The function in (59) implements precisely the paradigm for adjectives in German, which is given in (60) (from Zwicky 1986a) for reference:

(60)
decl. type   case   msc-sg   neut-sg   fem-sg   pl
strong       nom    -er      -es       -e       -e
             acc    -en      -es       -e       -e
             gen    -en      -en       -er      -er
             dat    -em      -em       -er      -en
weak         nom    -e       -e        -e       -en
             acc    -en      -e        -e       -en
             gen    -en      -en       -en      -en
             dat    -en      -en       -en      -en
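To make the intended reading of (59) concrete, the following is a minimal sketch of such a “simpleminded listing” of matches between agreement features and forms, rendered as executable code. It is purely illustrative: the function name and the representation of the agreement features are ours, not part of the analysis.

```python
# Sketch of the paradigm function PF for German attributive adjectives,
# following the case distinctions in (59)/(60). Names are illustrative only.
def pf_adjective(stem, num, gend, case, decl):
    """Return the inflected adjective form for the given agreement features."""
    if decl == "weak":
        if (num == "sg" and case in ("gen", "dat")) or num == "pl":
            return stem + "en"
    if num == "sg" and gend == "masc" and case == "acc":
        return stem + "en"
    if decl == "strong":
        if num == "sg" and gend in ("masc", "neut") and case == "gen":
            return stem + "en"
        if num == "pl" and case == "dat":
            return stem + "en"
        if num == "sg" and gend == "neut" and case in ("nom", "acc"):
            return stem + "es"
        if (num == "sg" and gend == "fem" and case in ("gen", "dat")) \
                or (num == "pl" and case == "gen"):
            return stem + "er"
        if num == "sg" and gend == "masc" and case == "nom":
            return stem + "er"
        if num == "sg" and gend in ("masc", "neut") and case == "dat":
            return stem + "em"
    # remaining cells of (60): fem/neut sg nom/acc, weak masc sg nom, strong pl nom/acc
    return stem + "e"

assert pf_adjective("gut-", "sg", "masc", "nom", "strong") == "gut-er"
assert pf_adjective("gut-", "pl", "fem", "dat", "weak") == "gut-en"
```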
Turning to determiners now, we propose to treat them essentially in the same way as adjectives, except that we do not have to pay attention to the
attributive/predicative distinction and the resulting semantic complications. One difference, though, is that in the case of determiners, the selected element is accessed via and that is not one of the features involved in the agreement relation between determiner and f:39 (61) det-agr-pat
1 | 2 3 n-agr || 1 2 3
The reason that is left out as one of the agreement features of determiners is that if one examines the distribution of the values for this feature within an NP, a pattern of (morphologically conditioned) feature governance emerges: Zwicky’s Class II determiners, i.e. those with a full inflectional paradigm such as the definite article der or the demonstrative dieser, always trigger weak adjectival (and nominal, where expressed, as in Verwandter (“relative”)) inflection while the latter takes strong forms if there is no determiner present at all: (62)
             Class II determiner:             No determiner:
num   case   weak adj. forms                  strong adj. forms
sg    nom    dies-er lieb-e Verwandt-e        gut-er Wein
      acc    dies-en lieb-en Verwandt-en      gut-en Wein
      gen    dies-es lieb-en Verwandt-en      gut-en Wein-es
      dat    dies-em lieb-en Verwandt-en      gut-em Wein-(e)
pl    nom    dies-e lieb-en Verwandt-en       gut-e Wein-e
      acc    dies-e lieb-en Verwandt-en       gut-e Wein-e
      gen    dies-er lieb-en Verwandt-en      gut-er Wein-e
      dat    dies-en lieb-en Verwandt-en      gut-en Wein-en
             “this/these dear relative(s)”    “good wine(s)”

39 We will not distinguish at the level of agreement patterns between those determiners for which a certain feature such as is inherently specified (e.g. pl for numerals such as zwei) and those that allow variation along this parameter (e.g. the definite article (der)).
Moreover, those determiners that do not exhibit any inflection at all, such as the numeral fünf (a representative of Zwicky’s Class I), pattern with the second column in (62); that is, they consistently trigger strong marking on the adjective (and noun, where expressed): (63)
Class I determiner:
case   strong adj. forms
nom    fünf gut-e Wein-e
acc    fünf gut-e Wein-e
gen    fünf gut-er Wein-e
dat    fünf gut-en Wein-en
       “five good wines”
Finally, there are determiners of Class III which participate in “mixed declension” within the NP. Such determiners – the indefinite article ein is an example – show no inflection in their singular masculine nominative and neuter nominative/accusative forms. In precisely those cases, the adjective shows strong declension, while with all other forms, i.e. in all those forms where the determiner is inflected, the adjective is inflected as weak: (64)
case   masculine                      neuter
nom    ein-∅ lieb-er Verwandt-er      ein-∅ schnell-es Auto
acc    ein-en lieb-en Verwandt-en     ein-∅ schnell-es Auto
gen    ein-es lieb-en Verwandt-en     ein-es schnell-en Auto-s
dat    ein-em lieb-en Verwandt-en     ein-em schnell-en Auto
       “a dear relative”              “a fast car”
We propose to treat this initially bewildering distribution of declension-type features in terms of two simple constraints. The first accounts for the fact that determinerless NPs are always marked as strong: (65)
noun-cat ||| 1 2 Constraints: 1 = strong if 2 = 〈 〉
This constraint, expressed at the level of , presupposes a treatment of determinerless NPs such as those containing mass nouns or bare plurals in terms of a disjunctively specified valence feature for the selection of determiners by nouns (cf. P&S 1994: ch. 9). If the disjunction is resolved to an empty list as the value of the feature, the constraint above will automatically instantiate the attribute on the noun to strong.40 The second constraint embodies a generalization about the occurrences of weak values within the f: the presence of a determiner will always trigger weak markings, except when the determiner is not inflected itself.41,42 Because in our system, the lexical representation of a lexical form contains both the phonology of the inflected form and a record of the stem’s phonology, the presence of inflection can very simply be described in terms of an inequality statement between the two phonology features: (66)
det-word 1 |2
. . .|| . . .| nom-agr 3 Constraints: weak iff 1 ≠ 2 3= strong otherwise
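The combined effect of (65) and (66) can be pictured procedurally as follows. This is only an illustrative sketch under our own simplified encoding (a determiner given as a stem/form pair), not part of the grammar itself: the declension value is strong for determinerless NPs, weak when an inflected determiner is present, and strong again when the determiner shows no inflection.

```python
# Sketch: deriving the declension type inside a German NP from (65) and (66).
# "Inflected" is detected as in (66): the determiner's surface form differs
# from its stem. Names and data layout are illustrative only.
def declension_type(det_stem=None, det_form=None):
    """Return 'strong' or 'weak' for the adjectives/nouns of an NP."""
    if det_form is None:          # (65): no determiner selected at all
        return "strong"
    if det_form != det_stem:      # (66): determiner shows overt inflection
        return "weak"
    return "strong"               # uninflected determiner (Class I, ein with zero ending)

print(declension_type())                      # bare NP: strong
print(declension_type("dies", "dieser"))      # Class II determiner: weak
print(declension_type("fünf", "fünf"))        # Class I determiner: strong
print(declension_type("ein", "ein"))          # Class III, masc nom: strong
print(declension_type("ein", "einen"))        # Class III, masc acc: weak
```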
40 Incidentally, we will assume here that the properties of a nominal projection relevant for its phrasal status are different from those determining the level of saturation for adjectives and determiners. Thus, a maximal NP (and hence the kind of thing selected by, say, verbs) is a nominal projection with an empty value. On the other hand, NP-internally we will assume with Netter (1994) that determiners qua their feature (just as complementizers) impose a marking value on the f, thus barring it from further specification as well as modification by adjectives. This will then be our approach to the “yellow the cats” problem. As Ivan Sag (p.c.) points out, however, it is not immediately clear how this analysis will construct the right semantics for examples like “alleged murderers.”
41 This generalization also extends to prenominal genitive NPs which also cannot express the whole NP’s feature specifications:
(i)
case
nom    meines Vaters gut-er Wein
acc    meines Vaters gut-en Wein
gen    meines Vaters gut-en Wein-es
dat    meines Vaters gut-em Wein-(e)
       “my father’s good wine”
42 As far as we are aware, there seem to be two exceptions to this otherwise completely consistent generalization. The numerals zwei and drei (but none of the other numerals) can (optionally) show inflection if the NP is marked as genitive. In such cases, however, despite the inflection, the f is nevertheless strong in its value, cf.:
As far as we are aware, no other treatment of declension type in German has so far been able to express this connection between morphology and feature governance equally succinctly. In particular, Netter (1994) has to stipulate a sortal distinction between infl-det and uninfl-det, but more importantly, the relationship between these sorts and each one’s morphological expression is accidental. On our analysis, on the other hand, the governance properties of a determiner are directly correlated with the properties of its inflectional paradigm.43 Note also that Netter’s and our treatments differ with respect to the analysis of what appear to be Class I determiners occurring inside of other determiners (and adjectives) (cf. Netter 1994: 251, 256): (67) a. zwei liebe Verwandte two nice-. relatives-. b. meine liebsten zwei Verwandten my-. most.favorite-. two relatives-. c. meine zwei guten Freunde my-. two good-. friends- Netter assumes that the examples in (67a, b) show that the determiner zwei must be assumed to be “transparent” for declension markings as it is able to appear both within NPs marked strong and those marked as weak. Yet, another analysis seems plausible here, namely one that assumes that numerals in general have a determiner and an adjective variant. The otherwise strange occurrence of zwei in (67b, c) above after the determiner meine would involve their adjectival form while in the NP-initial position in (67a), zwei acts as a determiner. As adjectives, numerals of course do not govern the distribution of declension markings, but simply agree with the f they are part of – if they inflect at all, which not all adjectives do (cf. for instance the noninflectable adjective lila (“lilac”)). Moreover, if we assume that as adjectives, numerals obligatorily select an f with a nonempty value, we have an account for why examples such as (68) are bad, in which the determiner is missing. It is not clear what would rule this out on Netter’s account. (68) *liebe zwei Verwandte nice-. two relatives-.
(i) a. der Puls dreier kräftiger Männer
       the pulse three-. strong-.. men-.
    b. der Puls dieser kräftigen Männer
       the pulse these-. strong-.. men-.

43 We have no explanation for this fact. While it is often claimed that the conceptual generalization about the distribution of the values within an NP is that a strong value must be expressed on either the determiner or the f, but never both, note that it is far from clear whether Netter’s approach succeeds in implementing this intuition more directly than our, extensionally equivalent, treatment.
Note also that as a byproduct of this constraint, when a numeral occurs NP-initially as in (67a), the alternative analysis as an adjective in a bare plural NP is not available, thus no spurious ambiguity problem will emerge. Finally, the double analysis of numerals as determiners/adjectives offers a straightforward explanation for the (weak) adjectival inflectional paradigm exhibited by ein when it occurs NP-internally, where it is arguably a numeral with adjectival properties: (69)
case
nom    der ein-e Verwandte
acc    den ein-en Verwandt-en
gen    des ein-en Verwandt-en
dat    dem ein-en Verwandt-en
       “the one relative”

If we put the entire discussion so far together, what we get is a much more perspicuous account of the feature distribution within German NPs than in P&S’s original theory. In fact, due to the agreement patterns between determiners, adjectives and nouns, the agreement-related structure sharing among these participants within an NP such as ein fleißiger Beamter (“a busy (state) official”) will be as outlined below:
(70)
[Structure-sharing diagram for ein fleißiger Beamter: the determiner ein, the adjective fleißiger, and the noun Beamter all share the nominal agreement value tagged 5, with 1 sg, 2 masc, 3 nom, 4 weak.]
It may be helpful to point out that since Beamter is one of only a handful of nouns in German that actually exhibit inflection for declension type, one might want to treat these nouns in a special way; to put it another way, since all other nouns never show declension type inflection, it would seem natural not to include this feature for the agreement patterns of the majority class
of German nouns. However, we want instead to take a somewhat different position here, namely that all nouns in German, regardless of morphological distinctions, contain information as the value of their feature. The reason is that it allows us to have a completely uniform system and to treat the question of whether or not declension type is expressed as part of morphology, rather than the syntax–morphology interface.

Consider the alternative. If instead we had different agreement patterns, say one for nouns like Beamter and one for the rest, we would in essence replicate in the agreement patterns a fact about the morphosyntactic features encoded on in each class. There seems little to be gained from this duplication of effort.44 But more importantly, if we assume that there are subcategories of nouns according to whether or not is an appropriate feature in their agr subsort, one would predict that – at least in principle – this dichotomy should somehow manifest itself syntactically in other domains. However, not only do we not find such cases; rather, it seems to be a very general fact that partitions of categories into subclasses according to whether or not a certain agreement feature is morphologically expressed never play a role outside of morphology. This seems to hold very generally (cf. subject agreement on modal auxiliaries in English or gender agreement on nonpast verbs in Russian). Therefore, we will keep the agreement patterns themselves maximally general, representing all agreement features that are relevant for any subcategory of the category in question.

Finally, if we compare the NP-internal agreement behavior of German with that of English, we notice again a difference in the kind of information that is relevant. In what could be argued to be the remaining few cases of true NP-internal number agreement in English, viz. the distinction between this/that and these/those, semantics supersedes morphosyntax, as evidenced by the example in (41) above. One possible (speculative) explanation might be that since only one feature is involved, its behavior is likely to be in accordance with the other agreement patterns in the language, e.g. subject–verb agreement, which have been shown to be more semantically driven. In languages like German, on the other hand, there are already two features which presumably are not correlated with semantic constraints. Given that the morphological realization of the nominal features is virtually entirely fusional in German and hence no correlation between a single feature and its expression can be established, it seems plausible that the other features would also behave in a way that is sensitive to what are in essence morphosyntactic distinctions, albeit often indistinguishable from the associated semantic properties.

5.1.2 Subject–verb agreement
Turning now to subject–verb agreement, we can ask the same question that we did at the beginning of the section on NP-internal agreement: where should agreement patterns be stated? Verbs, as well as any other types involving
44 Arnold Zwicky (p.c.) drew attention to this point.
agreement with arguments that are represented on valence features45 will have agreement patterns stated on values of the feature. Starting with English, we assume that there is only one agreement pattern that is common to all verbs via sort inheritance, regardless of whether the (sub)paradigm makes explicit distinctions for agreement (as in the present tense and the past tense forms of be) or not. As discussed earlier, the information crucial for determining agreement marking on the verb is that provided by the subject’s index: (71) Subject–verb agreement (SVA) in English fin-agr-pattern . . .| 1 2 nom-agr . . .| nom . . .| 1 2 In German, on the other hand, the situation is somewhat more complicated. Here, we have to distinguish between two kinds of patterns: one for impersonal constructions and one for personal ones, the latter one sensitive to morphosyntactic information on the subject nominal, as shown in (72).46 (72) SVA in German fin-agr-pattern personal/impersonal impersonal . . .| sg 3 ¬ 〈[]〉
45
46
personal . . .| 1 2 nom-agr . . .| 1 nom . . .| [2]
Valence attributes are features that are grouped under (, , and in P&S 1994) and whose values are reduced by the combination with some matching element (unlike in the case of or ). Implicit in the definition for the personal agreement is the assumption that – pace Pollard (1996) – finite verbs select their subject via the rather their feature. The main reason for this move is expository uniformity. While there are some empirical issues involved, note that for our purposes, it would present little effort to tailor the agreement system to Pollard’s conception of subject selection in finite and nonfinite verbs.
Agreement and the syntax–morphology interface in HPSG
265
By general sort inheritance, each finite verb will inherit the constraints stated in fin-agr-pattern, or, more precisely, one (and only one) of its subsorts (which is why fin-agr-pattern is partitioned into the two subsorts). In the personal case, as discussed earlier in section 4.1, we have reason to believe that this pattern involves morphosyntactic covariation. We also state the fact that subjects of finite verbs are marked with nominative case, in addition to the subject– verb agreement constraints. The impersonal subcase applies mostly whenever the list is empty, as in true impersonal constructions. However, it also extends to situations in which the subject is not an NP, such as with sentential subjects – provided that the clause is indeed a subject, which we will assume here for simplicity. Interestingly, the fact that every lexical verb inherits from the agreement pattern sort does not mean that for every verb, the agreement pattern is fixed. Notable exceptions, for example, are raising verbs. On an analysis of raising verbs along the lines suggested by Hinrichs and Nakazawa (1990), the presence or absence of a subject for the raising verb depends entirely on whether or not the governed verb itself takes a subject: (73) Raising in German 1 1 2 % 2 If it does not, as in the case of lexical impersonals as in (74a), or impersonal passives, as in (74b), the whole clause will not have a subject either, hence the impersonal agreement pattern is correctly instantiated. (74) a. Mir droht schlecht zu werden. me- threatens-3. sick to become “I feel like I’m getting sick.” b. Gestern scheint wieder getanzt worden zu sein. yesterday seems-3. again danced been to be “It seems that there was dancing again yesterday.” Incidentally, the very same account also covers more complicated cases, such as “quirky subjects” in Icelandic. The crucial difference between the two languages is that unlike German, Icelandic permits NP arguments that can be shown to pass all the relevant tests for subjecthood (in particular, control of nonfinite VPs), but surface with a case other than nominative.47 In the presence of quirky subjects, the inflection on the verb is invariably 3rd sg, while nominative subjects always trigger agreement. Again, as in the German sentential subject cases, since the ¬〈[]〉 is met, the impersonal agreement pattern applies, whereas the presence of a nominative NP obligatorily forces agreement with the finite predicate. 47
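As an informal illustration of how the two subsorts of fin-agr-pattern in (72) divide the cases, including the raising examples in (74) and the Icelandic quirky subjects just mentioned, consider the following sketch. The encoding of subjects and indices is ours and purely expository, not the grammar’s feature geometry.

```python
# Sketch: choosing between the personal and impersonal agreement pattern of (72).
# A finite verb agrees with a nominative NP subject; otherwise (empty SUBJ list,
# sentential subject, or a quirky non-nominative subject) it surfaces with
# default 3rd-singular morphology. Representation is illustrative only.
def finite_verb_agreement(subj=None):
    """subj: None, or a dict like {"cat": "np", "case": "nom", "per": 3, "num": "pl"}."""
    if subj is not None and subj.get("cat") == "np" and subj.get("case") == "nom":
        return {"per": subj["per"], "num": subj["num"]}   # personal pattern
    return {"per": 3, "num": "sg"}                        # impersonal pattern

# (74a) "Mir droht schlecht zu werden": no subject is raised, hence 3rd sg
print(finite_verb_agreement(None))
# Icelandic quirky (dative) subject: still 3rd sg on the finite verb
print(finite_verb_agreement({"cat": "np", "case": "dat", "per": 1, "num": "sg"}))
# ordinary nominative subject: full agreement
print(finite_verb_agreement({"cat": "np", "case": "nom", "per": 2, "num": "pl"}))
```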
47 Cf. Andrews (1982) and Sag et al. (1992).
Contrast the proposed treatment via sort inheritance now with an account of subject–verb agreement and nominative case assignment that relies on lexical rules, as e.g. the one outlined in P&S (1987). The following lexical rule from Kiss (1992: 159) shows how additional machinery has to be introduced such as functional dependencies that test for the presence of a subject:48 (75) Kiss’s Lexical Rule for Finiteness and Subject–Verb Agreement in German fin inf
⇒ sva(1) 1 2 nom(1)!2 Constraints: 3 if 1= :5 3 4 4 sva(1) = sg otherwise 3rd Other than that, it seems that the only way to get the distribution of agreement and case-marking right is to spell out the entry in (73) into two subentries: one for the personal and one for the impersonal case, so that the right lexical rule can apply in each case. There is no reason for us to assume that such a treatment would be superior to the one we propose here.49 At this point we should also mention that this approach affords a somewhat different view of the Italian example in (18) above, repeated below: (18) Si è orgliosi di se stessi. is-3. proud-. of one selves-. “One is proud of oneself.” Instead of treating si as the true subject here that triggers agreement on the copula, we can alternatively consider this sentence as an instance of a true impersonal construction, i.e. one without a syntactic subject.50 Since we adopt Manning and Sag’s (1995) distinction between valence features and argument structure, we can now think of the clitic si as the subject, which, however, is not realized via , but instead by means of the clitic store, . The least oblique element on orgliosi’s - list – Manning and Sag’s a-subject – will then serve as the controller of the agreement morphology of the adjective and concomitantly as providing the antecedent for the reflexive se stessi. The 48
49 50
Kiss makes a number of assumptions entering into this rule which we do not necessarily share. Most important for understanding his rule is that for him, finite and nonfinite forms in German are crucially different in their argument structure (cf. also the previous footnote on this point). While subjects of nonfinite verbs are not represented on the list (but instead on the list (cf. also Borsley 1989, Pollard 1996), which he considers to be a feature), they are elements of the list of finite verbs. See also Müller (1997: 58) on the same point. It should be pointed out, however, that P&S’s approach is compatible with a nonpersonal treatment of this example.
Agreement and the syntax–morphology interface in HPSG
267
lexical specifications for the copula, taking as its complement the impersonal version of the predicate orgliosi (cf. also Monachesi 1993) would then look roughly as follows: (76)
〈 〉 〈3,2〉 〈2〉 〈 〉 ,2 [di ] - 1,3 - 〈1,2〉 {1} 1 si-index pl masc 3
Assuming an agreement pattern for verbs along the lines outlined for German above in (72), subjectless constructions will automatically trigger 3rd, sg markings on the verb. Note that the index provided by si itself bears 3rd pl masc features. Since this is what will determine the morphology on the predicative adjective as well as provide an antecedent for the reflexive, the sentence is correctly accounted for. Note that these particular features on the index also manifest themselves in other example sentences containing predicative constructions from Monachesi (1993): (77) Si è spesso dimenticati. is-3. often forgotten-. “One is often forgotten.” (78) Si vive contenti. lives-3. happily-. “One lives happily.”
5.2 The treatment of tense, mood, etc.
Notice that our account of inflection on verbs has only gotten us part of the way because we have as yet not provided any means for bringing in other categories commonly thought of as inflectional, viz. tense and mood (and for some languages, voice, polarity etc.) as input into the morphological component. We want to argue that there is an important difference between agreement features and the other classes of inflectional categories. While the first contribute information about either the index or the morphosyntactic specifications of one of the arguments (mostly the subject), the other inflectional categories are correlated with a nontrivial contribution either to the whole form’s semantics (as in tense, mood, or aspect) or argument structure (as in voice). To capture this, we will assume, first, that these contributions are introduced in the way of word formation schemata, and, second, that these schemata will provide feature specifications which can be accessed by the
morphological component, that is, the paradigm function. But since there is no reason to assume that in the languages considered, tense and mood trigger the kind of covariation seen with the “true” agreement features, we have been assuming a separation between agreement features () and morphosyntactic features in general (). On the other hand, at least mood appears to be a category accessible to governance in the syntax, that is if we treat the base morphology on the verb in the following well-known examples as an expression of (present-)subjunctive morphology (cf. P&S 1987: 54): (79) a. I demand that he leave tomorrow. b. *I demand that he leaves tomorrow. c. *I demand that him leave tomorrow. Despite the nonfinite form, the fact there is a syntactically overt subject and moreover that this subject must be nominative in case is a strong indication that the verb actually patterns with regular finite forms here; hence we will treat the peculiar form as a case of morphological syncretism with the “true” nonfinite (base) form. Thus the situation here is no different from the one with modal auxiliaries which do not exhibit morphological distinctions according to the features of the subject either. To incorporate tense and mood into our system, we simply take a representation like the one in (20) and expand the set of features under to also include nonagreement information relevant for the morphology. In particular, tense and mood information in English will be specified as the value of a new feature, -. Our revised version of (20) then looks as follows: (80)
PF (4,5) = 〈walks〉 stem 4〈walk-〉 . . . finite
|5 1 3rd 2 sg - present - 〈[]3〉 walking 1 | 3 2 7 -8 temporal-overlap |- 17 28
The feature description above also incorporates, in a slightly modified way, the treatment of tense in terms of a contextual restriction on the spatio-temporal location associated with a state of affairs as proposed in P&S (1987: 192).51 Singling out the contribution by the tense, we can represent the word formation schema that will instantiate, say, present for the feature - on finite forms as follows: (81) morph-complex PF(4,5) stem 4 . . . finite | 5 - present . . . | 7 -8 temporal-overlap |- 17 28
Furthermore, we assume that the values of the feature are sorted, each subsort with a possibly different set of appropriate features. The relevant part of the sort hierarchy for English is given in (82): (82)
verb-morsyn finite -
nonfinite
Thus, while we still have the feature from GPSG and earlier versions of HPSG, the function here is somewhat different, as its sole purpose is to distinguish among the nonfinite forms infinitive, present-participle and past-participle as the verbal forms that modal auxiliaries (e.g. can), perfect tense auxiliaries (e.g. have), and forms of be select, respectively.52 As a further consequence, nonfinite verbs in English have neither a specification for 51
52
Thanks to Bob Kasper for pointing out that utterance time can be incorporated as one of the contextual indices. In the German literature on the different nonfinite verbal forms and their distribution, the common term is Status.
270
agreement nor for tense-mood. For finite forms, the latter system does not appear to be completely transparent any more, in the sense that true tense distinctions only apply to indicative forms, while the subjunctive forms seem to represent combined tense-mood information. If this turns out to be the correct way of looking at the system of tense and mood in English, the following would be the corresponding sort hierarchy for the (atomic) values of the - feature:53 tns-mood
(83) pres-subj = “base”
past-subj
indicative present
past
Besides -, other inflectional, but not agreement-related categories include polarity (as e.g. in Swahili – cf. Stump (1992) for a treatment of the complications originating from portmanteau morphemes for polarity and subject agreement), voice (e.g. in languages such as Swedish that mark passive directly on the verb, rather than via a periphrastic construction as in English or German), aspect, and others. The way that these word formation schemata interact with each other is illustrated for German in (84), which shows how tense and mood partition the space of nonagreement morphosyntactic features, such that each maximal subsort corresponds to a word formation schema specifying the morphology as well as semantics of the resulting form for both tense and mood: (84)
fin-v-formation tense present
mood past
indicative subjunctive
Before we close this section, let us make some brief remarks about how the paradigm function spells out the feature specifications contained in . The full inflectional paradigm of regular finite verbs in German is given below, exemplified by the verb lieben (“to love”).54
53
54
As Robert Kasper has pointed out (p.c.) this sort hierarchy does not reflect a morphological distinction between base form and to-infinitive. Rather we want to treat the latter by means of a selectional marking (e.g. ) that the auxiliary to imposes on the VP, similar to the way a preposition such as of marks its phrase as for purposes of selection of prepositional objects. Note that we will not be concerned about such details as the predictability of the epenthetic /-e-/ in the second person plural forms.
(85)
tense
number
person
indicative
subjunctive
present
sg
1st
lieb-e
lieb-e
2nd
lieb-st
lieb-est
3rd
lieb-t
lieb-e
1st
lieb-en
lieb-en
2nd
lieb-e(t)
lieb-e(t)
3rd
lieb-en
lieb-en
1st
lieb-t-e
lieb-t-e
2nd
lieb-t-est
lieb-t-est
3rd
lieb-t-e
lieb-t-e
1st
lieb-t-en
lieb-t-en
2nd
lieb-t-et
lieb-t-et
3rd
lieb-t-en
lieb-t-en
pl
past
sg
pl
271
As a first crude approximation, we can state the spell-out of the different feature combinations along the same lines as for adjectives in (59) above. As before, the following should not be mistaken for a full-fledged morphological account.55 As an additional feature of the spell-out below, we incorporate the insight that the endings for past forms (indicative or subjunctive) are precisely the same as the ones for present subjunctive, except that in the past, there is an additional -t- formative. The occurrence of such elements is probably a good indication that an ultimately satisfactory treatment should make reference to the notion of position class, and therefore should be handled in terms of an iteration of what Stump calls morpholexical functions. The way we want to approximate this generalization for our purposes here is in terms of the values of the PF function for other feature specifications, viz. the present subjunctive forms, reminiscent of, but not identical to, Zwicky’s and Stump’s “rules of referral” (cf. Zwicky 1985, Stump 1993).
55
In particular, as before, expressions such as “5 + -e” are supposed to be taken only as a first rough and ready characterization of the morphological process associated with a particular feature combination. It goes without saying that short of formally defining the operations such as “+” (concatenation) and the datatypes they operate on, these rules are strictly speaking without interpretation.
(86)
PF(5, [finite, 1, 2, 3, 4]) =
   5 + -e    iff  1 = sg & 2 = 1st & 3 = pres
              ∨   1 = sg & 2 = 3rd & 3 = pres & 4 = subj
   5 + -st   iff  1 = sg & 2 = 2nd & 3 = pres & 4 = ind
   5 + -t    iff  1 = sg & 2 = 3rd & 3 = pres & 4 = ind
              ∨   1 = pl & 2 = 2nd & 3 = pres & 4 = ind
   5 + -en   iff  1 = pl & 2 = (1st ∨ 3rd) & 3 = pres
   PF(5 + -t-, [finite, 1, 2, pres, subj])   otherwise
(where 1 = number, 2 = person, 3 = tense, 4 = mood, and 5 is the stem)
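Again purely for illustration, (86) and its rule of referral for past forms can be rendered as the following sketch; the encoding is our own simplification, not the authors’ formalism, and it is deliberately as crude as (86) itself.

```python
# Sketch of (86): spell-out of German finite-verb endings. Past-tense forms are
# "referred" to the present subjunctive endings of the stem extended by -t-,
# mirroring the -t- formative noted in the text.
def pf_finite(stem, num, per, tense, mood):
    if tense == "past":                                   # rule of referral
        return pf_finite(stem + "t-", num, per, "pres", "subj")
    if num == "sg" and per == 1:
        return stem + "e"
    if num == "sg" and per == 3 and mood == "subj":
        return stem + "e"
    if num == "sg" and per == 2 and mood == "ind":
        return stem + "st"
    if (num == "sg" and per == 3 and mood == "ind") or \
       (num == "pl" and per == 2 and mood == "ind"):
        return stem + "t"
    if num == "pl" and per in (1, 3):
        return stem + "en"
    return stem + "e"   # remaining subjunctive cells, crudely (cf. (85))

assert pf_finite("lieb-", "sg", 3, "pres", "ind") == "lieb-t"
assert pf_finite("lieb-", "pl", 1, "past", "ind") == "lieb-t-en"
```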
6 Conclusion
The purpose of this paper has been twofold. First, by making certain kinds of information explicit which were assumed to be for the most part implicit in P&S (1994), it is argued that we arrive at a more principled framework for expressing covariation phenomena. On the other hand, the kind of information that is grouped under (and , for that matter) is constrained by the requirement of morphological visibility, i.e., that no dimensions of variation are postulated that never surface in the morphology of any item of the lexical class in question. What this excludes in principle, for instance, is a simple identification of the agreement information on the finite verb with that of the subject. While the first contains a attribute inappropriate for the subject, the latter has a feature which does not trigger covariation with the verb. Future research will tell whether an approach that views the relation between agreement and morphological realization in a more uniform fashion is to be preferred.
References Andrews, Avery. 1982. The representation of case in modern Icelandic. In The Mental Representation of Grammatical Relations, ed. Joan Bresnan. Cambridge, MA: MIT Press. 427–503. Balari, Sergio (Ravera). 1992. Feature Structures: Linguistic Information and Grammatical Theory. Ph.D. dissertation, Autonomous University of Barcelona. Barlow, Michael, and Charles Ferguson. 1988. Agreement in Natural Language. Stanford: Center for the Study of Language and Information.
Bird, Steven, and Ewan Klein. 1993. Enriching HPSG phonology. Research Paper EUCCS/RP-56, Centre for Cognitive Science, University of Edinburgh. Borsley, Robert. 1989. An HPSG approach to Welsh. Journal of Linguistics 25: 333–354. Calder, Jo. 1994. Feature-value logics: some limits on the role of defaults. In Constraint Propagation, Linguistic Description and Computation, ed. by Rod Johnson, Mike Rosner, and C. J. Rupp. New York: Academic Press. Carpenter, Bob. 1992. ALE user’s guide. Laboratory for Computational Linguistics Report CMU-LCL-92-1, Laboratory for Computational Linguistics, Carnegie Mellon University. Chierchia, Gennaro. 1989. Anaphora and attitudes de se. In Semantics and Contextual Expression, ed. Renate Bartsch, Johan van Benthem, and P. van Emde Boas. Dordrecht: Foris. Corbett, Greville. 1983. Hierarchies, Targets and Controllers: Agreement Patterns in Slavic. London: Croom Helm. Corbett, Greville. 1988. Agreement: a partial specification based on Slavonic data. In Agreement in Natural Language. Stanford: Center for the Study of Language and Information. 23–43. Corbett, Greville. 1991. Gender. Cambridge: Cambridge University Press. Dowty, David. 1996. Towards a minimalist theory of syntactic structure. In Discontinuous Constituency, ed. Harry Bunt and Arthur van Horck. Berlin, New York: Mouton de Gruyter. 11–62. Dowty, David, and Pauline Jacobson. 1988. Agreement as a semantic phenomenon. In Proceedings of the 5th Eastern States Conference on Linguistics. 1–17. Erjavec, Tomal. 1995. Unification, Inheritance, and Paradigms in the Morphology of Natural Languages. Ph.D. dissertation, University of Edinburgh. Hinrichs, Erhard, and Tsuneko Nakazawa. 1990. Subcategorization and VP structure in German. In Proceedings of the Third Symposium on Germanic Linguistics, ed. Shaun Hughes and Joe Salmons. Amsterdam: Benjamins. Kathol, Andreas. 1994. Passives without lexical rules. In German in Head-Driven Phrase Structure Grammar, ed. John Nerbonne, Klaus Netter, and Carl J. Pollard. Stanford: Center for the Study of Language and Information. 237–272. Kathol, Andreas, and Robert T. Kasper. 1993. Agreement and the Syntax-Morphology Interface in HPSG. Unpubl. MS, Ohio State University. Kim, Jong-Bok. 1993. On the Stucture of Korean Lexicon and Word Formation – A Constraint Based Approach. Presentation at the First International HPSG Workshop, August 2, Columbus, OH. Kim, Jong-Bok. 1994. A Constraint-Based Lexical Approach to Korean Verb Inflections. Unpublished MS, Stanford University. Kiss, Tibor. 1992. Infinite Komplementation: Neue Studien zum deutschen Verbum infinitum. Ph.D. dissertation, Universität/Gesamthochschule Wuppertal. Krieger, Hans-Ulrich. 1994. Derivation without lexical rules. In Constraint Propagation, Linguistic Description, and Computation, ed. Rod Johnson, Mike Rosner, and C. J. Rupp. New York: Academic Press. 277–313. Krieger, Hans-Ulrich, and John Nerbonne. 1992. Feature-based inheritance networks for computational lexicons. In Default Inheritance within Unification-based Lexicons, ed. Ted Briscoe, Ann Copestake, and Valerie de Paiva. Cambridge: Cambridge University Press. Lehmann, Christian. 1988. On the function of agreement. In Agreement in Natural Language, ed. Michael Barlow and Charles Ferguson. Stanford: Center for the Study of Language and Information. 55–65.
Manning, Chris, and Ivan Sag. 1995. Dissociations between Argument Structure and Grammatical Relations. Unpubl. MS, Carnegie Mellon University and Stanford University. Monachesi, Paola. 1993. On si constructions in Italian HPSG grammar. In Proceedings of the Tenth Eastern States Conference on Linguistics, ed. Andreas Kathol and Michael Bernstein. 223–234. Müller, Stefan. 1997. Spezifikation und Verarbeitung deutscher Syntax in Head-Driven Phrase Structure Grammar. Ph.D. dissertation, Universität des Saarlandes. Netter, Klaus. 1994. Towards a theory of functional Heads: German nominal phrases. In German in Head-Driven Phrase Structure Grammar, ed. John Nerbonne, Klaus Netter, and Carl J. Pollard. Stanford: Center for the Study of Language and Information. 297–340. Nunberg, Geoffrey. 1977. The Pragmatics of Reference. Ph.D. dissertation, CUNY Graduate Center. Nunberg, Geoffrey. 1993. Indexicality and deixis. Linguistics and Philosophy 16: 1–43. Pollard, Carl J. 1996. On head non-movement. In Discontinuous Constituency, ed. Harry Bunt and Arthur van Horck. Berlin, New York: Mouton de Gruyter. 279–305. Pollard, Carl, and Ivan A. Sag. 1987. Information-based Syntax and Semantics. Vol. 1, CSLI Lecture Notes Series no. 13. Stanford: Center for the Study of Language and Information. Pollard, Carl, and A. Sag. 1992. Anaphors in English and the scope of binding theory. Linguistic Inquiry 23: 261–303. Pollard, Carl, and A. Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press and Stanford: Center for the Study of Language and Information. Riehemann, Susanne A. 1993. Word Formation in Lexical Type Hierarchies – A Case Study of bar-Adjectives in German. Master’s thesis, University of Tübingen. Sag, Ivan, Lauri Karttunen, and Jeffrey Goldberg. 1992. A lexical analysis of Icelandic case. In Lexical Matters, ed. Ivan A. Sag and Anna Szabolcsi. Stanford: Center for the Study of Language and Information. 301–316. Spencer, Andrew. 1991. Morphological Theory. An Introduction to Word Formation in Generative Grammar. Oxford: Basil Blackwell. Stump, Gregory. 1991. A paradigm-based theory of morphosemantic mismatches. Language 67: 675–725. Stump, Gregory. 1992. On the theoretical status of position class restrictions on inflectional affixes. In Yearbook of Morphology, ed. Geert Booij and Jaap van Marle. Dordrecht: Kluwer. 211–241. Stump, Gregory. 1993. On rules of referral. Language 69: 449–479. Zwicky, Arnold. 1985. How to describe inflection. In Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, ed. Mary Neipokuj, Mary Van Clay, Vassiliki Nikiforidou, and Deborah Feder. 372–386. Zwicky, Arnold. 1986a. German adjective agreement in GPSG. Linguistics 24: 957–990. Zwicky, Arnold. 1986b. Imposed versus inherent feature specifications, and other multiple feature markings. In IULC Twentieth Anniversary Volume. Boomington: Indiana University Linguistics Club. 85–106. Zwicky, Arnold. 1987. Phonologically Conditioned Agreement and Purely Morphological Features. Technical report. UCSC Syntax Research Center, Report SRC-87-06.
7 Partial VP and split NP topicalization in German: an HPSG analysis1
Erhard Hinrichs, University of Tübingen
Tsuneko Nakazawa, University of Tokyo
1 Introduction
In previous work (Hinrichs and Nakazawa 1989, 1993, 1994, forthcoming) we have presented an HPSG analysis of German VP structure that crucially relies on the notion of argument composition. The scope of that analysis was essentially restricted to those clause types which place the finite verb in sentence-final position. The purpose of this paper is to present an analysis for the remaining clause types of German, specifically for assertion main clauses. This requires an account of the two syntactic phenomena that characterize assertion main clauses: 1. an account of topicalization, which places a single constituent in sentence-initial position, and 2. an account of the so-called V2-position of the finite verb, which follows the topicalized constituent. The treatment of these two phenomena builds on recent work of Pollard (1990) and Nerbonne (1994), although it differs considerably in scope and detail. We will follow Pollard’s head nonmovement analysis of the finite verb in V2-position. However, we will depart from Pollard’s account when it comes to topicalization. Pollard allows any combination of flat structure and hierarchical structure among verb-complement structures in order to be able to account for cases in which verbal constituents are topicalized with some of their NP arguments while others remain in situ. By Pollard’s own admission, the resulting analysis has the undesirable property of introducing spurious
1 We are grateful for invaluable comments from: Tania Avgustinova, Kathy Baker, Thilo Goetz, Georgia Green, Josse Heemskerk, Tibor Kiss, Detmar Meurers, Michael Moortgat, Frank Morawietz, Jerry Morgan, John Nerbonne, Klaus Netter, Dick Oehrle, Karel Oliva, Carl Pollard, Frank Richter, Ivan Sag, Bernhard Schwarz, Mark Steedman, and two anonymous referees.
ambiguity on a massive scale. Instead, our account of topicalization will follow the program begun by Nerbonne (1994). Nerbonne accounts for cases in which an incomplete verbal constituent is topicalized not by postulating a corresponding constituent in nontopicalized position. Rather, he treats such partial verbal constituents via a lexical rule that restricts their distribution to the topicalization construction. As the title suggests, the bulk of the paper will be devoted to an HPSG treatment of topicalization in German. In the first part we will discuss those phenomena previously discussed by Nerbonne, in particular topicalization of so-called partial verb phrases (or PVPs for short). We will offer a lexical account of such phenomena which avoids some crucial shortcomings of Nerbonne’s analysis. In the second part we will broaden the range of data to other cases in which partial material is fronted, namely cases which have been referred to in the literature as split-NP topicalization. We will show that the same type of mechanism that can account for PVP topicalization can be employed for split-NP topicalization. Finally we will show how the interaction of the lexical rules for PVP topicalization and split-NP topicalization can work in parallel and offer an account of cases of topicalization which have been cited as open problems in the literature: namely topicalized PVPs in which one of the topicalized NP complements has a corresponding remnant in nontopicalized position. The resulting analysis will be presented in the version of HPSG outlined in chapter 9 of Pollard and Sag (1994) and builds upon their treatment of long-distance dependencies without traces. Accordingly we will use the feature geometry introduced in chapter 9 as much as possible. For the sake of uniformity, we will take the liberty of recasting older analyses, including our own, to bring them in line with the assumptions made in chapter 9.
2
Argument composition in the verbal complex
In previous work we presented an analysis of German VP structure that crucially relied on the mechanism of argument composition. For the ditransitive verb geben “give” and its nonsubject arguments in sentences such as (1) our analysis produced the tree structure shown in (2). (1)
Ich glaube nicht, daß Peter Maria das Buch geben können wird. I believe not that Peter Maria the book give can will “I don’t believe that Peter will be able to give Maria the book.”
(2) V[ +][ 〈 〉] [ +] NP
V[ +][ 〈 〉] [ +] NP
V[ +][ 〈NP〉] [ +] V[ +][ [1]〈NP,NP〉] [ −]
NP
V[ +] [ [1]〈NP,NP〉] [ −] Vword [ [1]〈NP,NP〉] [ −] Peter Maria das Buch geben
Vword [ +] [ append([1], 〈V[ [1]]〉)]
Vword [ +] [ append([1], 〈V[ [1]]〉)] können
wird
According to this analysis, the main verb geben in (2) first combines with the auxiliary verb können “can” before any nonverbal complements are added. We will refer to such constituents which contain only verbal categories and no NP complements as verbal complexes. The motivation for such a constituent structure comes from the fact that verbal complexes, i.e. main verbs together with auxiliaries, can be topicalized as in (3). Since topicalized elements are commonly assumed to be single constituents, this provides evidence for the constituenthood of geben können “give can” in (3). (3)
Geben können wird Peter Maria das Buch. give can will Peter Maria the book “Peter will be able to give Maria the book.”
In addition, the verbal complex serves as the domain over which auxiliaries can be fronted. This so-called auxiliary flip construction positions finite auxiliaries such as wird “will” in (4) to the left in the verbal complex, instead of the customary sentence-final position for subordinate clauses.
(4)
Ich glaube nicht, daß Peter Maria das Buch wird geben können. I believe not that Peter Maria the book will give can “I don’t believe that Peter will be able to give Maria the book.”
As we showed in our earlier work, cases of auxiliary flip can be treated in terms of simple LP-statements if main verbs and auxiliaries form constituents that do not contain any NP complements. The proposed constituent structure requires that subcategorization information about nonverbal complements is propagated from the main verb to the top of the verbal complex. In the framework of HPSG, this can be achieved by structure sharing the complements of the main verb with the subcategorization information of each auxiliary in the sentence. This then leads to lexical entries for auxiliaries such as können “can” as shown in (5). (5) word können || verb bse | append [1] list ([||| ¬verb]), || verb bse | [1] −
Können requires a base infinitive verbal complement, as indicated in the value in (5). In the case of geben können “give can” shown in (2), this base infinitive verbal complement is instantiated in the syntax by the lexical entry for geben. The [ −] specification of the value in (5) specifies that the complement is either a lexical verb, or a verbal complex which dominates no NP complements. Lexical verbs are unspecified for the feature, and are therefore compatible with the [ −] specification required by the value while verbal complexes are specified to be [ −] by virtue of the Verbal Complex ID Schema below. The value also ensures that the subcategorization information of the governed verb appears in the value of the auxiliary itself, as indicated by tag [1]. In the case of geben können, this means that the dative and accusative NP complements that geben subcategorizes for are raised into the list of können. As shown in (2), this information is then propagated to the mother category of the local tree for geben können. This feature percolation is enforced by the HPSG Valence Principle. When geben können combines with wird “will,” as in (2), the value of geben können is once again passed on to the mother of the local tree for geben können wird “give can will” through the lexical category for wird “will” and the Valence Principle. In order to prevent verbal complements from being raised as well, the elements of the complement list marked by tag [1] carry the specification
[ ~verb]. Without this constraint, we would introduce multiple constituent structures for sentences which contain auxiliaries. For example, apart from the structure shown in (2), sentence (1) would also be analyzed as having a totally flat structure, since wird could raise the verbal signs which it governs plus the complements of the governed verbal signs. In such a constituent structure, the value of können would be instantiated as 〈NP,NP,V[ −]〉, i.e. contain geben and its nonsubject complements, and the value of wird would contain können and the value of können, i.e. 〈NP,NP,V[ −],V[ +]〉. The Verbal Complex ID Schema in (6) licenses local trees which consist of an head and its verbal complement. (6)
Verbal Complex ID Schema (preliminary version) || verb → ‒ H word, C[||| verb ]
Thus, together with the Verbal Complement ID Schema, the lexical specification of the value of auxiliaries ensures the desirable distribution of the subcategorization information of the main verb, while allowing the main verb and auxiliaries to form a constituent.
3
Multiple constituent structures
One issue not addressed in our previous work is the question whether the type of structure in which main verbs and auxiliaries form constituents that do not contain any NP complements is the only structure necessary. Johnson (1986a) and Pollard (1990) claim that a second class of structures is needed in order to account for the fact that main verbs and NP complements can be topicalized in sentences such as (7). (7)
Ein Gedicht auswendiglernen müssen die Schüler können. a poem memorize must the students be able to “The students must be able to memorize a poem.”
(7) exhibits the V2 word order of German assertion clauses: the finite verb müssen “must” follows the topicalized constituent ein Gedicht auswendiglernen “a poem memorize” in sentence-initial position. In order for ein Gedicht auswendiglernen to form a constituent in (7) or in the corresponding subordinate clause in (8a), the constituent structure in (8b) seems to suggest itself. (8)
a. daß die Schüler ein Gedicht auswendiglernen können müssen that the student a poem memorize be able to must “that the student must be able to memorize a poem”
V[ +][[1]〈 〉]
b.
H[ +][ [1]〈 〉]
NP
V[ +] [ [1]〈 〉] V[ [1]〈 〉] NP
H word[ +] [ append([1], 〈V[ [1]]〉)]
Hword [ +] [ append([1], 〈V[ [1]]〉)]
H word [〈NP〉]
die Schüler ein Gedicht auswendiglernen können
müssen
In general, two types of constituent structures are required then: those in which main verbs first combine with auxiliaries and then with their complements, and those structures in which the order of combination is reversed. However, this introduces multiple structures for subordinate clauses such as (8a) which have no independent syntactic or semantic motivation. In fact, this problem of spurious ambiguity becomes even more extensive if one follows the line of analysis taken in Pollard (1990). As Pollard himself points out, in his analysis the sentence in (9) is given six different constituent structures. (9)
Hat er seiner Tochter ein Märchen erzählen können? has he his daughter a fairy tale tell be able to “Could he tell his daughter a fairy tale?”
4
The analysis of Pollard (1990)
Pollard (1990) allows nonfinite verbs to combine with an arbitrary number of their NP complements via the ID schema, called Schema B, reproduced in (10). (10) Schema B (Pollard 1990) ||| 〈[1]〉 → 〈[2], . . . , [m]〉 word [m+1], . . . , [n], H ||| 〈[1]〉 〈[2], . . . , [n]〉 Following Borsley’s analysis of Welsh (Borsley 1987, 1989), Pollard assumes that nonfinite verbs have nonempty values. For finite verbs the subject
argument appears in the list, while the list itself is empty. Since the ID schema in (10) requires a nonempty value, the rule applies only to nonfinite verbs. The ID schema discharges complements [m+1] through [n] from the list of the head daughter and passes the remaining arguments [2] through [m] up to the mother. Pollard generates the finite verb as the head daughter of a clause in which all of the complements of the finite verb, including its subject, are realized in a flat structure. To this end, Pollard adopts an ID schema, called Schema C, which was originally introduced into the theory for the account of Subject– Auxiliary Inversion in English. (11) Schema C (Pollard 1990) ||| 〈 〉 → 〈 〉 word || verb H fin 〈 〉 〈[1], . . . , [n]〉
, [1], . . . , [n]
In addition to his Schemata B and C, Pollard adopts the argument composition approach for auxiliaries proposed in our earlier work. This means that NP arguments can either be discharged directly by the main verb via Schema B or be raised via argument composition when the NP-governing verb combines with an auxiliary. All in all Pollard’s analysis admits six different analyses for the same sentence which come about through different interactions of argument composition licensed by the auxiliary können “can” and argument discharge via Schema B. These analyses can be distinguished by the bracketings in (12). The bracketing in (12a) shows argument raising of both NP complements. In (12b) only the dative NP seiner Tochter “his sister” is raised, but no argument is raised in (12c). For (12c) Schema B combines the auxiliary with the governed verb erzählen “tell” and its NP complements in one flat structure. Yet another option that the analysis allows is for Schema B to apply only to erzählen and its NP complements. In (12d) seiner Tochter and ein Märchen “a fairy tale” combine with erzählen before the auxiliary können is added, while in (12e) and (12f ) erzählen combines only with its accusative NP before können is added. (12) a. b. c. d. e. f.
[Hat er seiner Tochter ein Märchen [erzählen können]] [Hat er seiner Tochter [ein Märchen erzählen können]] [Hat er [seiner Tochter ein Märchen erzählen können]] [Hat er [[seiner Tochter ein Märchen erzählen] können]] [Hat er seiner Tochter [[ein Märchen erzählen] können]] [Hat er [seiner Tochter [ein Märchen erzählen] können]]
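The argument-raising option that underlies the bracketings in (12a, b), namely composing an auxiliary’s complement list out of the governed verb’s complements as in the lexical entry (5), can be pictured schematically as follows. The list encoding is ours and intended only as an illustration, not as the actual feature-structure representation.

```python
# Sketch: argument composition in the verbal complex. An auxiliary's COMPS list
# is the governed verb's nonverbal COMPS plus the governed verb itself (cf. (5)).
# Encoding is illustrative only.
def auxiliary_comps(governed_verb):
    raised = [c for c in governed_verb["comps"] if c["head"] != "verb"]
    return raised + [governed_verb]

erzaehlen = {"phon": "erzählen", "head": "verb",
             "comps": [{"head": "noun", "case": "dat"},
                       {"head": "noun", "case": "acc"}]}
koennen = {"phon": "können", "head": "verb", "comps": auxiliary_comps(erzaehlen)}
hat     = {"phon": "hat",    "head": "verb", "comps": auxiliary_comps(koennen)}

# hat ends up subcategorizing for NP[dat], NP[acc] and the verbal complex,
# which is the raising configuration of (12a).
print([c.get("phon", c.get("case")) for c in hat["comps"]])
```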
Notice that analyses such as Johnson (1986a) and Pollard (1990) which suffer from the spurious ambiguity problem are based on one crucial assumption, namely that the topicalized constituent corresponds to some noninitial constituent in subordinate clauses. However, this assumption has been challenged in the literature, most recently by Kiss (1993) and by Nerbonne (1994). Kiss (1993) cites examples with split-NP topics as in (13) and (14). (13) a. Hier gibt es keine giftigen Schlangen. here gives it no[] poisonous[] snakes “There are no poisonous snakes here.” b. Giftige Schlangen gibt es hier keine. poisonous[] snakes gives it here no[] “There are no poisonous snakes here.” c. *Giftigen Schlangen gibt es hier keine. poisonous[] snakes gives it here no[] When the determiner keine “no” appears together with the adjective in nontopicalized position, as in (13a), then the determiner has to be in the strong form and the adjective in the weak form. However, when the adjective giftige “poisonous” and the noun Schlangen “snakes” are topicalized without a determiner, as in (13b), then giftige has to appear in the strong form. The weak form of the adjective becomes ungrammatical, as (13c) shows. Hence there is no strict identity between NPs in topic position and corresponding NPs in nontopic position. (14) shows that cases of ein–kein “a”–“no” doublets are admissible among split-NPs, as in (14a), but are not grammatical in nontopicalized position, as in (14b). (14) a. Ein Zufall ist das keiner. an[] accident is this no[] “This is no accident.” b. *Das ist keiner ein Zufall. this is no[] an[] accident Nerbonne (1994) points out yet another construction that is problematic for assuming structural correspondence. (15) a. Einen Hund füttern, der Hunger hat, wird wohl jeder a dog feed which hunger has will well everyone dürfen. may “Presumably everyone is allowed to feed a dog that is hungry.” b. *Es wird wohl jeder einen Hund füttern, der Hunger hat, it will well everyone a dog feed which hunger has dürfen. may
c. Es wird wohl jeder einen Hund füttern dürfen, der Hunger it will well everyone a dog feed may which hunger hat. has “Presumably everyone is allowed to feed a dog that is hungry.” Example (15a), which Nerbonne attributes to Tilman Höhle, involves an extraposed relative clause and an object NP in topicalized position. However, the same material cannot appear together in nontopicalized position as the ungrammatical (15b) shows. When the object NP is not topicalized, the relative clause must be extraposed, as in (15c). In fact, Nerbonne (1994) presents a treatment of topicalization that gives up this assumption of structural correspondence and that manages to avoid the problem of spurious ambiguity that Pollard’s analysis suffers from. The main insight of Nerbonne’s account of topicalization is that PVPs, that is verbs together with some of their NP complements, form constituents only in topicalized position. This is accomplished by a lexical rule which places PVPs into the -value of finite verbs and discharges them in topicalized position. Our own treatment of topicalization adopts Nerbonne’s idea, although it differs significantly in detail and avoids some problematic aspects of Nerbonne’s original account. As a starting point we will adopt the flat-structure for the finite verb and its complements that has been proposed by Pollard (1990). For the subordinate clause in (1), repeated below, Pollard’s treatment assigns as one admissible structure the tree in (16). (1)
Ich glaube nicht, daß Peter Maria das Buch geben können wird. I believe not that Peter Maria the book give can will “I don’t believe that Peter will be able to give Maria the book.”
(16)
NP
V[ +][ 〈 〉] NP
NP
V[ +] [ [1]〈NP,NP〉] Vword [ [1]〈NP,NP〉]
Peter Maria das Buch geben
Vword [ +] [ append([1], 〈V[ [1]]〉)]
Vword [ +] [ append([1], 〈V[ [1]]〉)] können
wird
One of the main advantages of the type of structure that (16) exemplifies is that it allows a nonmovement analysis of auxiliary flip and of the placement of the finite verb for the different clause types of German. We will return to this point in detail in section 8 below.
Because of the spurious ambiguity problem, Pollard’s account produces quite a few additional structural analyses of the same clause. In our analysis (16) will turn out to be the only admissible analysis. This means that we will always force a binary hierarchical structure for the verbal complex via the ID schema in (6) and a flat structure for the finite verb and all of its complements via a revised version of Pollard’s Schema C given in the next section.
5
PVP topicalization
Given the type of structure which is exemplified by the tree (16), the question that immediately arises is how we can account for the full range of topicalization possibilities in German. The choices for the topicalized sentence-initial constituent include single NP constituents, as in (17a), or single verbal constituents, as in (17b)–(17f ), as long as such constituents contain main verbs. (17) a. Ein Märchen wird er seiner Tochter erzählen. a fairy tale will he his daughter tell “He will tell his daughter a fairy tale.” b. Seiner Tochter ein Märchen erzählen wird er. his daughter a fairy tale tell will he “He will tell his daughter a fairy tale.” c. Ein Märchen erzählen wird er seiner Tochter. a fairy tale tell will he his daughter “He will tell his daughter a fairy tale.” d. Ein Märchen erzählen wird er seiner Tochter müssen. a fairy tale tell will he his daughter must “He will have to tell his daughter a fairy tale.” e. Erzählen wird er seiner Tochter ein Märchen. tell will he his daughter a fairy tale “He will tell his daughter a fairy tale.” f. Erzählen müssen wird er seiner Tochter ein Märchen. tell must will he his daughter a fairy tale “He will have to tell his daughter a fairy tale.” In (17a) the accusative complement ein Märchen “a fairy tale” is topicalized. In example (17b), a VP is topicalized while in (17c–d), a PVP is topicalized leaving some of its NP complements and an auxiliary in sentence-final position. The remaining examples show topicalization of verbal material only: (17e) shows topicalization of a main verb alone; (17f ) exemplifies topicalization of a verbal complex with a main verb and an auxiliary. Following Nerbonne (1994) we will account for these topicalization data by adopting the Complement Extraction Lexical Rule of Pollard and Sag (1994). Their rule is reproduced in (18).
(18) Complement Extraction Lexical Rule (Pollard and Sag 1994)

     SUBCAT ⟨ ..., [LOC [1]], ... ⟩          SUBCAT ⟨ ..., ... ⟩
     INHER|SLASH { }                   →     INHER|SLASH { [1] }

This lexical rule has the effect of removing an element from the SUBCAT list and placing the value for its LOCAL feature onto the SLASH list instead. The rule in (18) accounts for the case of topicalization of the accusative NP complement in (17a), if (18) is instantiated for the main verb erzählen "tell" and the extracted value is instantiated to be of category NP[CASE acc]. Consider next the topicalization of a full verb phrase as in (17b). The Complement Extraction Lexical Rule in (18), as it stands, will not suffice to account for sentences like (17b). In order to generate (17b), the lexical rule (18) would have to apply to the lexical entry of the auxiliary wird "will." As shown in (5), auxiliaries raise all complements of the verbs they govern. Therefore, the fronted material seiner Tochter ein Märchen erzählen "his daughter a fairy tale tell" does not constitute a single entry in the SUBCAT list of the lexical entry for wird. As a result, (18) can only extract a raised NP complement or a verbal complement which is instantiated to be a main verb alone or a verbal complex. But (18) cannot account for topicalization of a full VP as in (17b) or for the cases of PVP topicalization in (17c) and (17d). We therefore introduce an additional lexical rule to account for the topicalization of partial or full verb phrases. This rule is given in (19).

(19) PVP-Topicalization Lexical Rule²

     HEAD verb, AUX +
     SUBCAT append([2] list([HEAD ¬verb]), ⟨ V[HEAD [1], SUBCAT [2]] ⟩)
     INHER|SLASH { }
          →
     SUBCAT [6] list([HEAD ¬verb])
     INHER|SLASH { V[HEAD [1], SUBCAT ⟨ ⟩, INHER|SLASH [5]] }

     where same-member([5], [6]).
² As formulated by Pollard and Sag (1994), the SLASH feature takes a set of LOCAL values rather than a set of entire signs. The PVP-Topicalization Rule in (19), however, requires entire signs in the SLASH value, since the slashed element includes an instantiated SLASH value of its own, in addition to its LOCAL value, as shown in the following sections. In general, we assume that the SLASH value is a set of signs. We also take the value of SUBCAT to be a list of signs, rather than a list of synsem objects (as in Pollard and Sag 1994). This revision is also motivated by our treatment of split-NP topicalization that is given in section 10.
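The effect of the Complement Extraction Lexical Rule in (18) on a lexical entry can be pictured with a small sketch. This is a toy rendering of our own, not part of the formal analysis: lexical entries are modelled as Python dictionaries, SUBCAT as a list, and SLASH as a set of category labels.

```python
# A minimal sketch of the effect of the Complement Extraction Lexical Rule (18).
# Lexical entries are plain dicts; this is only an illustration, not the HPSG formalism.

def complement_extraction(entry, index):
    """Return a new entry with the index-th SUBCAT member moved onto SLASH."""
    assert not entry["slash"], "the input SLASH value is empty, as in (18)"
    extracted = entry["subcat"][index]
    return {
        "phon": entry["phon"],
        "subcat": entry["subcat"][:index] + entry["subcat"][index + 1:],
        "slash": frozenset({extracted}),
    }

# Instantiation used for the tree in (23): the dative complement of erzaehlen
# is moved onto SLASH, leaving only the accusative NP on SUBCAT.
erzaehlen = {"phon": "erzaehlen", "subcat": ["NP[dat]", "NP[acc]"], "slash": frozenset()}
print(complement_extraction(erzaehlen, 0))
# {'phon': 'erzaehlen', 'subcat': ['NP[acc]'], 'slash': frozenset({'NP[dat]'})}
```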
The rule in (19) places a verbal constituent with a saturated SUBCAT list on the SLASH set of an auxiliary. In addition, the slashed verbal constituent itself has a SLASH value [5] whose members are shared by the SUBCAT value [6] of the auxiliary. The relation same-member indicates that the set value [5] and the list value [6] have the same members. This condition ensures that the list of NPs that are missing from the topic appears on the SUBCAT list of the auxiliary and is therefore realized in nontopicalized position. In order to prevent potential interactions with other extraction lexical rules, the INHER|SLASH value of the input category is taken to be empty. For the topicalization of full VPs the SUBCAT value [6] of the auxiliary in (19) is instantiated to be the empty list, as can be seen in tree (20) for example (17b).
(20) [Tree display for (17b). The filler daughter [7] is the full VP seiner Tochter ein Märchen erzählen, a head–complement structure with SUBCAT ⟨ ⟩ and an empty SLASH set [5]. The head daughter consists of the finite auxiliary wird, whose list value [6] is instantiated as the empty list and which carries the filler [7] on its INHER|SLASH set, together with the subject NP er.]
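The core effect of the PVP-Topicalization Lexical Rule in (19), together with its same-member condition, can be rendered in the same toy style. The function and the dict encoding below are purely illustrative and not part of the formal analysis; the subject NP is omitted for simplicity, and the NPs realized inside the topic (via (18) and Schema C) are not modelled.

```python
# Toy rendering of the PVP-Topicalization Lexical Rule (19).  The governed verb is
# moved from the auxiliary's SUBCAT onto its SLASH set as a saturated sign whose own
# SLASH members are exactly the NPs left behind on the auxiliary's SUBCAT list.

def same_member(slash_set, subcat_list):
    """The same-member relation of (19): set [5] and list [6] have the same members."""
    return set(slash_set) == set(subcat_list)

def pvp_topicalization(aux, left_behind):
    """'left_behind' are the raised NPs realized with the auxiliary (list [6])."""
    governed = next(c for c in aux["subcat"] if c.startswith("V"))
    topic = {"phon": governed, "subcat": [], "slash": frozenset(left_behind)}  # [5]
    new_aux = {"phon": aux["phon"], "subcat": list(left_behind),               # [6]
               "slash": frozenset({governed})}
    assert same_member(topic["slash"], new_aux["subcat"])  # the where-clause of (19)
    return new_aux, topic

# Instantiation for (17c): the dative NP stays with wird, so the topicalized
# PVP 'ein Maerchen erzaehlen' is missing a dative NP.
wird = {"phon": "wird",
        "subcat": ["NP[dat]", "NP[acc]", "V[erzaehlen]"],
        "slash": frozenset()}
new_wird, topic = pvp_topicalization(wird, ["NP[dat]"])
```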
The topicalized constituent is placed on the SLASH value of the auxiliary wird "will" and is licensed by the Filler-Head ID Schema. In order to be able to generate seiner Tochter ein Märchen erzählen "his daughter a fairy tale tell" in topicalized position, we assume the revised version of Schema C shown in (21).

(21) Head-Complement ID Schema (Revised Schema C)

     [HEAD verb, SUBCAT ⟨ ⟩] → C*, H[word, LEX +]

This Head-Complement ID Schema can apply not only to finite VPs, as in the analysis of Pollard (1990), but to nonfinite VPs as well. This generality is achieved by leaving the value for the VFORM feature unspecified on the head daughter. Thus, (21) licenses the local trees both for the topicalized material seiner Tochter ein Märchen erzählen and for the nontopicalized part wird er in (20). In fact, once Schema C has been modified in this way, it is the only
ID schema which we need for structures that consist of a verbal head and NP complements in German. While the Head-Complement structure licensed by the above ID schema may be either finite, as in the case of assertion main clauses, or nonfinite, as in the case of topicalized (partial) VPs, verbal complexes are always nonfinite, since the finite verb does not form part of a verbal complex under the proposed analysis. The Verbal Complex ID Schema originally proposed in (6) is revised to admit only a nonfinite verbal head, as in (22).

(22) Verbal Complex ID Schema (final version)

     [HEAD verb] → C, H[word, HEAD verb, VFORM ¬fin]

Next, example (17c), which contains a PVP topic, exhibits a more interesting case, for which the PVP-Topicalization Lexical Rule (19) is needed. The analysis tree for (17c) is given in (23).
(23) [Tree display for (17c). The filler daughter [7] is the partial VP ein Märchen erzählen, with SUBCAT ⟨ ⟩ and INHER|SLASH {[5]NPdat}. The head daughter consists of the finite auxiliary wird, with [7] on its INHER|SLASH set, together with the subject NP er and the dative NP [5] seiner Tochter.]
Lexical Rule (18) generating the lexical entry for erzählen "tell":

     erzählen: SUBCAT ⟨ [5]NPdat, NPacc ⟩
          →  erzählen: SUBCAT ⟨ NPacc ⟩, INHER|SLASH { [5]NPdat }

Lexical Rule (19) generating the lexical entry for wird "will":

     wird: HEAD verb, AUX +,
           SUBCAT append([2] list([HEAD ¬verb]), ⟨ V[HEAD [1], SUBCAT [2]] ⟩),
           INHER|SLASH { }
          →  wird: SUBCAT ⟨ [5]NPdat ⟩,
               INHER|SLASH { [7] V[HEAD [1], SUBCAT ⟨ ⟩, INHER|SLASH { [5]NPdat }] }
Notice that the topicalized constituent is saturated in the sense that its SUBCAT list is empty. Consequently it can be generated by the Head-Complement ID Schema in (21) if we apply the Complement Extraction Lexical Rule (18), which we need for NP complement extraction anyway, to the lexical entry of the ditransitive verb erzählen. Notice also that the complements of erzählen "tell" are split between the SUBCAT list of wird and the SUBCAT list of the main verb erzählen. Seiner Tochter "his daughter" is realized on the SUBCAT list of wird and is generated in nontopicalized position of the clause. Ein Märchen "a fairy tale" appears on the SUBCAT list of the main verb and is realized together with the main verb in topicalized position. Due to indices [5] and [6] of the PVP-Topicalization Lexical Rule (19), the NP complement which is generated in nontopicalized position also appears on the SLASH set of the topicalized main verb. That is, the topicalized VP is partial in the sense that it is missing a dative NP. There is one more detail to explain about the PVP-Topicalization Lexical Rule, namely the distribution of the INHER|SLASH and TO-BIND|SLASH values that is illustrated in (23).³ The topicalized constituent ein Märchen erzählen "a fairy tale tell" has as its SLASH value the dative NP seiner Tochter "his daughter" that is missing from the topic. This value may not be passed up from the topic node to the mother category for the sentence as a whole. This can be achieved in HPSG through the interaction of the Nonlocal Feature Principle and the two features INHER|SLASH and TO-BIND|SLASH. The Nonlocal Feature Principle specifies the INHER|SLASH value of the mother as the set union of the INHER|SLASH values of all daughters minus the TO-BIND|SLASH value of the head daughter. Notice that the Nonlocal Feature Principle takes only those TO-BIND|SLASH features into account that are specified on the head daughter. Since the SLASH value on the filler in (23), i.e. the filler's INHER|SLASH set, needs to be bound, this value has to appear on the TO-BIND|SLASH feature of the head daughter in the Filler-Head structure. This is achieved via index [2] in the Filler-Head ID Schema given in (24).

(24) Filler-Head ID Schema

     [phrase] → [1][sign, INHER|SLASH [2]],
                H[sign, HEAD verb[VFORM fin], SUBCAT ⟨ ⟩,
                        INHER|SLASH { [1] }, TO-BIND|SLASH { [1] } ∪ [2]]
³ Our thanks go to Carl Pollard (p.c.) for the treatment of the TO-BIND|SLASH values presented here. His constructive criticism of our earlier idea gave us invaluable insights in formulating the current version.
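The computation just described can be pictured with a further toy sketch: the Nonlocal Feature Principle takes the union of the daughters' INHER|SLASH sets and subtracts the head daughter's TO-BIND|SLASH set. Python sets stand in for the HPSG set values, and the labels are invented purely for the example.

```python
# Toy rendering of the Nonlocal Feature Principle: the mother's INHER|SLASH is the
# union of the daughters' INHER|SLASH values minus the head daughter's TO-BIND|SLASH.

def mother_inher_slash(daughters, head):
    inherited = set().union(*(d["inher_slash"] for d in daughters))
    return inherited - head["to_bind_slash"]

# Filler-head structure as in (23): the filler carries the dative NP [5] on its own
# INHER|SLASH; the head daughter binds off both the filler and that dative NP.
filler = {"inher_slash": {"NP[dat]"}, "to_bind_slash": set()}
head = {"inher_slash": {"PVP-topic"}, "to_bind_slash": {"PVP-topic", "NP[dat]"}}
print(mother_inher_slash([filler, head], head))   # set() -- nothing is passed up
```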
The head daughter in the Filler-Head ID Schema takes as the value of the TO-BIND|SLASH feature the set union of its own INHER|SLASH value {[1]}, i.e. the topic that needs to be bound off, and of the INHER|SLASH value [2] of the topic, which contains the material missing from the topic. The Nonlocal Feature Principle will then guarantee that these two values are not passed up to the mother of the filler-head structure. So far we have considered PVP topics whose NP complements are split between the topic and the rest of the clause. Let us consider next examples such as (17e–f), in which only verbal material is topicalized. In (17e) a main verb is fronted alone. In order for the Filler-Head ID Schema to license such a topic, the filler daughter cannot be restricted to phrases, but must admit lexical signs such as main verbs as well. This is why the filler daughter in the ID schema in (24) is of sort sign, which subsumes both lexical and phrasal signs.⁴ However, if lexical signs are admitted, then our grammar is in danger of overgeneration with respect to ungrammatical examples such as (25).

(25) *Müssen wird er seiner Tochter ein Märchen erzählen.
      must will he his daughter a fairy tale tell

(25) shows that not just any lexical sign can be fronted in German. Bare auxiliaries, for example, do not make well-formed topics. (25) would be admitted if the Complement Extraction Lexical Rule in (18) were applied to the lexical entry for wird in such a way that it places the lexical sign for the verb that it governs, i.e. müssen, on its SLASH list. However, (25) can be ruled out if we revise the Complement Extraction Lexical Rule in (18) as in (26).

(26) Complement Extraction Lexical Rule (revised version)

     SUBCAT ⟨ ..., [1][phrase], ... ⟩          SUBCAT ⟨ ..., ... ⟩
     INHER|SLASH { }                     →     INHER|SLASH { [1] }

(26) limits the value of SLASH to phrasal signs. But now haven't we thrown out the baby with the bathwater, since the revised lexical rule will not permit the extraction of a main verb either, so that examples such as (17e) cannot be accounted for? Fortunately, the PVP-Topicalization Lexical Rule can come to the rescue.
⁴ The Filler-Head ID Schema in (24) also allows the head daughter to be either lexical or phrasal. In the proposed analysis, the main clause in German may consist of a single lexical verb, as in the following example, where the saturated finite verb lacht "laughs" appears with the topic NP Anna:

(i) Anna lacht.
    Anna laughs
    "Anna laughs."
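In the same toy terms used above, the revision in (26) amounts to a sortal check on the extracted SUBCAT member; the is_phrase predicate below is our own stand-in for the restriction to phrasal signs and is not part of the analysis.

```python
# Toy rendering of the revised Complement Extraction Lexical Rule (26): only a
# phrasal complement may be moved onto SLASH, so extracting a bare auxiliary such
# as 'muessen' in (25) is blocked.

def complement_extraction_revised(entry, index, is_phrase):
    target = entry["subcat"][index]
    if not is_phrase(target):
        raise ValueError("only phrasal signs may be extracted under (26)")
    return {
        "phon": entry["phon"],
        "subcat": entry["subcat"][:index] + entry["subcat"][index + 1:],
        "slash": frozenset({target}),
    }
```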
In order to place erzählen "tell" in topic position in (17e), the auxiliary wird "will" has to undergo the PVP-Topicalization Lexical Rule. The instantiation of the PVP-Topicalization Lexical Rule that is necessary for (17e) is shown in (27).

(27)
     word: wird
     HEAD verb[VFORM fin], AUX +
     SUBCAT append([2] list([HEAD ¬verb]), ⟨ V[HEAD [1], VFORM bse, SUBCAT [2]] ⟩)
          →
     word: wird
     SUBCAT ⟨ [4]NPdat, [5]NPacc ⟩
     INHER|SLASH { [6] V[HEAD [1], VFORM bse, SUBCAT ⟨ ⟩, INHER|SLASH { [4]NPdat, [5]NPacc }] }
The lexical rule moves the verbal complement from the SUBCAT list into the SLASH set of wird. For (17e) this verbal complement is then instantiated to be the main verb erzählen. The NP complements of erzählen, indicated by tags [4] and [5], are placed on the SUBCAT list of wird and are consequently instantiated in nontopicalized position as sisters of wird. The resulting analysis for sentence (17e) as a whole is shown in (28).
(28) [Tree display for (17e). The filler daughter [6] is the bare main verb erzählen, with SUBCAT ⟨ ⟩ and INHER|SLASH { [4]NPdat, [5]NPacc }. The head daughter consists of the finite auxiliary wird, with SUBCAT ⟨ [4]NPdat, [5]NPacc ⟩ and with [6] on its INHER|SLASH set, together with the subject NP er, the dative NP [4] seiner Tochter, and the accusative NP [5] ein Märchen.]
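In the toy terms introduced earlier, (27) corresponds to leaving both raised NPs behind with the auxiliary, so that the topic is the bare verb with both NPs on its own SLASH set. This snippet is purely illustrative and reuses the pvp_topicalization sketch given above.

```python
# Instantiation of the toy pvp_topicalization sketch for (17e)/(27):
# both raised NPs stay on wird's SUBCAT list, so the topic is the bare verb.
wird = {"phon": "wird",
        "subcat": ["NP[dat]", "NP[acc]", "V[erzaehlen]"],
        "slash": frozenset()}
new_wird, topic = pvp_topicalization(wird, ["NP[dat]", "NP[acc]"])
# topic:    {'phon': 'V[erzaehlen]', 'subcat': [], 'slash': frozenset({'NP[dat]', 'NP[acc]'})}
# new_wird: {'phon': 'wird', 'subcat': ['NP[dat]', 'NP[acc]'], 'slash': frozenset({'V[erzaehlen]'})}
```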
The PVP-Topicalization Lexical Rule ensures that the two NP complements [4] and [5] that are saturated as sisters of wird are missing from the topic node [6] for erzählen. Thus, the SLASH set of erzählen has these two elements. In order to be able to generate
PVP topics that are missing more than one NP, we have to make one more revision to the Complement Extraction Lexical Rule in (26).

(29) Complement Extraction Lexical Rule (final version)

     SUBCAT ⟨ ..., [1][phrase], ... ⟩          SUBCAT ⟨ ..., ... ⟩
     INHER|SLASH [2]                     →     INHER|SLASH { [1] } ∪ [2]