Explorations in Seamless Morphology
Edited by
Rajendra Singh and Stanley Starosta in collaboration with Sylvain Neuvel
SAGE Publications
New Delhi · Thousand Oaks · London
Copyright © Rajendra Singh and Stanley Starosta, 2003
All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage or retrieval system, without permission in writing from the publisher.
First published in 2003 by Sage Publications India Pvt Ltd B-42, Panchsheel Enclave New Delhi 110 017 Sage Publications Inc 2455 Teller Road Thousand Oaks, California 91320
Sage Publications Ltd 6 Bonhill Street London EC2A 4PU
Published by Tejeshwar Singh for Sage Publications India Pvt Ltd, typeset at InoSoft Systems in 10 pt Nalandabaskar and printed at Chaman Enterprises, Delhi.
Library of Congress Cataloging-in-Publication Data
Explorations in seamless morphology/edited by Rajendra Singh and Stanley Starosta with the assistance of Sylvain Neuvel.
p. cm.
Includes bibliographical references and index.
1. Grammar, Comparative and general--Morphology. I. Singh, Rajendra, 1943– II. Starosta, Stanley. III. Neuvel, Sylvain.
P241.E94 2003
415--dc21 2002151158
ISBN: 0-7619-9594-3 (US-Hb) 81-7829-223-8 (India-Hb)
Sage Production Team: Zarin Ahmad, Sushanta Gayen and Santosh Rawat
for Mana, the grand-daughter of Stanley Starosta (1939–2002)†
†Stanley Starosta, the author of The Case for Lexicase (1988) and, more importantly, a friend of the other editor and grandfather of Mana, died of a heart attack last year. He will always be remembered for his penetrating insights and his integrity.
Contents
Acknowledgements
Introduction. Rajendra Singh and Stanley Starosta
1. Prolegomena to a Theory of Non-Pāṇinian Morphology. Alan Ford and Rajendra Singh
2. Some Advantages of Linguistics without Morpho(pho)nology. Alan Ford and Rajendra Singh
3. In Praise of Śakatāyana: Some Remarks on Whole Word Morphology. Rajendra Singh and Alan Ford
4. On So-called Compounds. Rajendra Singh and Probal Dasgupta
5. On Defining the Chinese Compound Word: Headedness in Chinese Compounding and Chinese VR Compounds. Stanley Starosta, Koenraad Kuiper, Siew Ai Ng and Zhiqian Wu
6. Do Compounds have Internal Structure? A Seamless Analysis. Stanley Starosta
7. Micronesian Noun Incorporation: A Seamless Analysis. Stanley Starosta
8. Semantic Fragmentation in Word-Formation: The Case of Spanish -AZO. Franz Rainer
9. Towards a Universal Theory of Shape-invariant (Templatic) Morphology: Classical Arabic Re-considered. Robert R. Ratcliffe
10. Paradigmatic Morphology. Thomas Becker
11. The Importance of Being Ernist. Probal Dasgupta
12. A Perfect Strategy for Latin. Byron W. Bender
13. Morphology in Minimal Information Grammar. Danko Šipka
About the Editors and Contributors
Index
Acknowledgements
We are grateful to the following for permission to reproduce papers originally published by them:
1. Linguistic Analysis for P. Dasgupta, ‘The Importance of Being Ernist’.
2. Mouton de Gruyter, Berlin for Starosta et al., ‘On defining the Chinese compound word’.
3. John Benjamins, Amsterdam, the publishers of the original French version of Ford and Singh, ‘Linguistics without Morphophonology’.
4. Folia Linguistica, the publishers of the original French version of A. Ford and R. Singh, ‘Prolegomena to a theory of Non-Paninian morphology’.
Introduction
Rajendra Singh and Stanley Starosta

I
Seamless morphology has multiple origins, going back as far as Bhartrihari (8 C.E.). In recent times it has taken form independently in several places in Europe and North America. In the case of the versions espoused by the two editors of this volume, the seamless view evolved from two different directions. Rajendra Singh, Alan Ford, and several students of theirs at the Université de Montréal came from the direction of phonology, and were addressing the question of the boundary between phonology and word structure, and the status of morphophonemics/morphonology in grammatical theory. Their first proposal regarding the nature and structure of morphology is contained in Ford and Singh (1983), a paper they presented at the CLS special parasession on interfaces.
In the lexicase version of dependency grammar that was developed at the University of Hawai’i by Stanley Starosta and some of his students and colleagues, on the other hand, beginning with Harvey Taylor’s dissertation on Japanese in 1971, seamless morphology was the unplanned offspring of an attempt to constrain the power of Chomskyan syntactic theory. This produced a monostratal syntactic dependency representation called ‘lexicase’ which operated exclusively with words rather than smaller units. The lexicon contained only words, and grammatical statements about the correlation between word shapes and syntactic distribution took the form of operations that created new words from lexically listed ones.
The shared assumptions and principles of the lexicase and Montréal approaches to seamless morphology will be readily apparent in the papers in this volume authored or co-authored by Singh and Starosta, and more generally but with some minor differences by all the contributors to this volume.1 They include the following:
(i) Grammars operate with words. Only words are stored in the lexicon, not parts of words (stems, roots, affixes) and not chunks larger than individual words (e.g., phrasal idioms). There are no grammatical operations that construct words by combining other words or by combining syntactically nonoccurring fragments of words. Instead, regular correlations between the shapes of sets of words and their distributions are formalized by declarative non-directional analogical patterns called Word Formation Strategies (WFSs), as in Ford and Singh’s work.
(ii) There is no word-internal syntax. Words do not have any internal non-phonological structure. An element analyzed as a minimal syntactic unit in a given sentence cannot contain subcomponents delimited by square brackets or other non-phonetically motivated boundary elements. Words have no (non-phonological) hierarchical structure in which one subcomponent of the word is the ‘head’ and the remaining part or parts are subordinate to it. They are, in essence, seamless wholes.
It would be rather surprising if all the seamless approaches had converged on exactly the same target, and as the contributions to this volume demonstrate, they have not quite. The Montréal and lexicase versions, for example, differ in at least the following respects.
(i) Whereas the phonological component of the lexicase framework is still in the programmatic stage, the Montréal version deliberately leaves questions of syntax and semantics open at this stage. The former assumes that words are composed of autonomous phonemes related to each other by pairwise dependency links which produce syllables and David Stampe’s ‘sonables’ as an epiphenomenon. It cannot refer to ‘alternations’ in formulating phonological generalizations because no way has been found within this approach to define alternations without reference to disallowed segmental morphemes.
By contrast, the Montréal phonological representations on which WFSs operate are not conventional autonomous phonemic representations. Rather, they are required not to contain automaticity-governed (= attributable to automatic alternations) phonic
information. All information which can be predicted by general, category-independent phonological rules, rules that may be postulated to account for automatic alternations, is extracted from the phonological representations of entries in the lexicon, and phonology has the job of restoring these minimal representations to the level of pronounceability. The English ‘plural s’ will serve as an example. Montréal WFSs will refer to forms such as [kætz] ‘cats’ and [hɔrsz] ‘horses’. Impossible codas such as [tz] and [sz], whether associated with WFSs or not, will be rendered pronounceable by phonology, which uses only phonotactic well-formedness conditions and (phonotactically motivated) repair mechanisms of a general sort. The lexicase description of this way of marking the plural in English, on the other hand, would require four separate WFSs to account for these patterns, one each for [kæt] ‘cat’: [kæts] ‘cats’, [dɔg] ‘dog’: [dɔgz] ‘dogs’, [hɔrs] ‘horse’: [hɔrsiz] ‘horses’, and [fli] ‘flea’: [fliz] ‘fleas’, and would have no deviance to repair. The phonology in this view will simply spell out what are traditionally regarded as allophonic details.
(ii) The ‘morphological categories’ referred to in the Montréal WFSs are intended to be seen as bundles of features. They tend to be rather traditional, while those in their lexicase counterparts are more elaborate and embedded in an extensively tested syntactic framework. For example, the following is a partial lexicase WFS for English past participles:

[+V, −spct] : [+V, −fint, +spct, +prfc]

Salv]   :   Salvəd]      bat   :   batted
Cvl]    :   Cvlt]        slap  :   slapped
Xvd]    :   Xvdd]        grab  :   grabbed
]       :   n]           beat  :   beaten
ɪ]      :   ʌ]           ring  :   rung
]       :   ]            hit   :   hit
The WFS for English passive verbs then applies to the transitive subset of past participles:
[+trns, +spct, +prfc, n AGT, αFi] : [−trns, +mode, +pasv, +prfc, +spct, n MNS, αFi]
The passive rule needs no shape component since English passive verbs are identical in shape to past participles. The Montréal approach does not recognize any syntactic or morphological structure as the English passive and sees what is generally referred to as the passive as simply one of the interpretations of a subclass of past participles (cf. Langacker 1990, Langacker and Munro 1975, and, importantly, Cuyckens and Zawada 1997). It does not invoke even (morphological) identity in the case at hand. There is, in other words, no WFS involved here in the Montréal approach.
(iii) Outside of synchronic description, the Montréal seamless approach has been employed primarily in psycholinguistic work, in particular, in language acquisition (cf. Singh and Martohardjono 1988) and speech errors (cf. Martohardjono and Singh 1992), while lexicase has been used in studies of grammatical subgrouping and reconstruction.
(iv) As for synchronic descriptions, lexicase has been utilized especially in descriptions of Pacific and East, Southeast, and South Asian languages, while the Montréal version has been applied in particular to studies of Western European, Indo-Aryan, Native American, and Semitic languages.
(v) The Montréal approach takes a strong separationist position. The rules of semantic interpretation with which Montréal WFSs are or can be associated are not incorporated into WFSs. Convinced of the inadequacy of the reductionist description of lexical meaning, Ford and Singh reject the idea that a semantic description can be reduced to a bundle of features or semantic markers, and anticipate describing and characterizing semantic structures relative to mental spaces, scenarios etc., and not in terms of markers and features of the sort popularized by Katz and Fodor (Katz
and Fodor 1963). The Hawai’ian version, on the other hand, takes a non-separationist position and does not limit the kinds of lexical information which may appear in WFSs. Thus grammatically relevant lexically marked semantic properties may appear in a WFS if they are a regular part of the analogical relationship between two sets of words.
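The contrast drawn under difference (i) for the English plural can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names, the segment classes and the two particular repairs (voicing agreement and vowel insertion) are ours and are not taken from either framework, which speak only of phonotactically motivated repair mechanisms of a general sort. The Montréal view is modelled as one shape pattern followed by repair, and the lexicase view as four surface pairs listed directly.

```python
# Hypothetical, simplified segment classes (an assumption of this sketch).
VOICELESS = set("ptkfθ")
SIBILANTS = set("szʃʒ")


def montreal_plural(singular: str) -> str:
    """Single strategy [X] <-> [Xz]; the result may be unpronounceable."""
    return singular + "z"


def repair(form: str) -> str:
    """Illustrative phonotactic repair of an ill-formed final cluster."""
    if len(form) >= 2 and form.endswith("z"):
        previous = form[-2]
        if previous in SIBILANTS:
            # *[sz] is broken up by a vowel: hɔrsz -> hɔrsiz
            return form[:-1] + "iz"
        if previous in VOICELESS:
            # *[tz] is repaired by voicing agreement: kætz -> kæts
            return form[:-1] + "s"
    return form


# Lexicase-style alternative: four surface pairs, nothing left to repair.
LEXICASE_PAIRS = {
    "kæt": "kæts",
    "dɔg": "dɔgz",
    "hɔrs": "hɔrsiz",
    "fli": "fliz",
}

for singular, surface_plural in LEXICASE_PAIRS.items():
    derived = repair(montreal_plural(singular))
    print(singular, montreal_plural(singular), derived, derived == surface_plural)
```

On these four nouns the two routes arrive at the same surface forms; what differs is where the generalization is stated, in the strategies themselves or in the phonology.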
II
For convenience, this collection can be and has in fact been divided into several groups of papers. The first group (Chapters 1 to 4) consists of four papers, the first three of which provide a general picture of Ford and Singh’s theory of morphology. The first of these (Prolegomena) outlines the problems with traditional morpheme-based accounts of morphology and outlines their theory. The second piece recapitulates some of the main points of their theory, extensively argues in favour of incorporating morphonology directly into morphology, and shows precisely how that incorporation must be done. As doubts regarding the statability of different kinds of morphological processes within their framework have sometimes been expressed in the literature, their brief homage to Śakatāyana shows precisely how the diversity of morphological processes can be naturally accommodated and happily expressed within the framework of their theory. The last piece of this group, by Singh and Dasgupta, shows how one of the allegedly problematic cases, the case of compounds, is also naturally statable within that framework, albeit with important descriptive implications.
The second group of papers contains Starosta’s response, articulated with the help of several colleagues in the case of Chinese compounds, to the question of the internal structure of compounds and of verbs said to have incorporated nouns. The exploration of compounding in Chinese is an empirically rich challenge to the traditional view of compounding in that language. In the more general paper on the internal structure of compounds Starosta demonstrates that, pace Anderson (1992), the properties of such words do not in fact necessitate the Andersonian weakening of the constraint against word-internal grammatical structure. Starosta’s re-examination of noun incorporation in Micronesian languages continues that thread and offers a thoroughly seamless account of this phenomenon.
The papers in the third group, by Rainer and Ratcliffe, offer penetrating critiques of the Gesamtbedeutung analysis of morphologically complex words. Whereas Rainer shows that diachronic evidence flies in the face of such analyses, Ratcliffe provides a convincing immanent critique of the prosodic morphology avatar of that view. As Semitic is generally taken to provide particularly strong arguments for abstract roots and other Pāṇinian objects of wonder, Ratcliffe’s paper constitutes an important argument for the efficacy of word-based models of morphology. If even Arabic morphology is word-based, as he argues, the Pāṇinian view does not have much to support itself. In his detailed and empirically rich immanent critique, Ratcliffe argues, with remarkable clarity and elegance, that to say that the data of Semitic languages have for centuries been analyzed in terms of a grammatical theory in which the notions of (consonantal) roots and (vocalic-syllabic) patterns play a central role is not to say that these languages have root and pattern morphology. His exposé would undoubtedly have pleased Wittgenstein.
The fourth group of papers consists of responses, by Becker, Bender, and Dasgupta, to the view advocated by Ford and Singh. While Dasgupta concentrates on the cutting required by Gesamtbedeutung, Becker responds from the point of view of paradigmatic morphology. Although Bender accepts and demonstrates (for Latin) the usefulness of WFSs, he argues for the indispensability of the notion or construct ‘paradigm’, argued out of theoretical existence in Ford et al. (1997).
The last paper, by Šipka, shows how an efficient MIG grammar can be constructed without invoking notions and categories that Pāṇinian morphology routinely invokes. Although Šipka makes no psychological claims, the efficiency of his computational model, which, we repeat, does not invoke traditional sub-motic units, should give pause to those who have made Pāṇinian atomism their religion.
In the five years in which the Montréal and Hawai’i approaches to whole-word morphology have been actively interacting, the similarities and differences listed above have been identified and clarified, and a certain amount of terminological standardization and convergence has taken place. This tendency is likely to continue as additional unmotivated seams in the overall weave are identified and eliminated. We see this volume as broadening and accelerating that process, and look forward to further clarifications and convergence in the near future.
Note 1. The Ford and Singh approach has also been called Projection Morphology, as in Singh and Martohardjono (1988) and Martohardjono and Singh (1992), because it sees morphology as a projection from a given mental lexicon, and Network Morphology, as in Singh and Agnihotri (1997) because it sees all related words as constituting a network. They have finally settled on Whole Word Morphology (cf. Singh and Ford 2000, reprinted as Chapter 3 of this volume, Neuvel and Singh 2001, and Singh and Neuvel [forthcoming]).
References
Cuyckens, Hubert and Britta Zawada. 1997. Polysemy in Cognitive Linguistics. Amsterdam: Benjamins.
Ford, Alan and Rajendra Singh. 1983. ‘On the status of morphophonology’, in Richardson et al. ed., The Interplay of Phonology, Morphology, and Syntax. Chicago: Chicago Linguistic Society (CLS).
Ford, Alan, Rajendra Singh and G. Martohardjono. 1997. Pace Pāṇini. New York: Peter Lang.
Katz, J.J. and J.A. Fodor. 1963. ‘The structure of a semantic theory’. Language, 39: 170–210.
Langacker, R.W. 1990. Concept, Image, and Symbol: The Cognitive Basis of Grammar. Berlin: Mouton.
Langacker, R.W. and P. Munro. 1975. ‘Passives and their meanings’. Language, 51: 789–830.
Martohardjono, G. and Rajendra Singh. 1992. ‘Speech errors and integral listing’. Linguistische Berichte.
Neuvel, Sylvain and Rajendra Singh. 2001. ‘Vive la difference. What Morphology is really about’, Folia Linguistica.
Singh, Rajendra and R.K. Agnihotri. 1997. Modern Hindi Morphology. Delhi: Motilal Banarasidass.
Singh, R. and G. Martohardjono. 1988. ‘Intermorphology and morphological theory’, in W. O’Neil and S. Flynn eds, Linguistic Theory and Second Language Acquisition, pp. 362–83. Dordrecht: Kluwer.
Singh, Rajendra and Sylvain Neuvel. Forthcoming. ‘When the whole is smaller than the sum of its parts: The case of Morphology’, Proceedings of the Chicago Linguistics Society 2002.
1
Prolegomena to a Theory of Non-Pāṇinian Morphology1
Alan Ford and Rajendra Singh

‘But it is accomplished because when a suffix is mentioned we understand that which begins with the form to which the suffix is added and that which ends with the suffix.’
– Kātyāyana (c. 300 B.C.), vārttika 7 in Benson (1990: 40)

‘One cannot claim that even if the adhikāra rule “pratyaye” is stated and a proper limitation is effected through the implication of an aṅga which is based on mutual concomitance in the manner described, the attack which concludes with the quotation of the paribhāṣā is fruitless. The reason for this is that although there exists an abundance of the aṅga when the suffix is added, the expectancy of it is absent when we have knowledge of the suffix and therefore there are no valid grounds for accepting an implication of the aṅga in the grammar.’
– Nāgeśa in Benson (1990: 123)

‘When I say: “My broom is in the corner”, – is this really a statement about the broomstick and the brush?’
– Ludwig Wittgenstein (1953: 29)

Morphology is the study of formal relationships between words. Words are, however, not given in advance: they are determined by the theory. In the theory proposed here, an expression must possess the following three properties in order to be a word:
(i) a phonological structure;
(ii) a category;
(iii) a meaning.
Whether these properties are sufficient or not to define the word is a question that we shall leave open. They are necessary but insofar as we wish to distinguish between syntax (formal relationships between units other than the word) and morphology, they are probably not sufficient.
The theory of morphology defended here is based on the following hypothesis: any morphological relationship between a nonunique pair of words of a language can be described by a rule, to be called a morphological strategy (MS) or a Word Formation Strategy (WFS), having the following form:
(1) [X]α ↔ [X′]β
where:
(i) X and X′ are words;
(ii) α and β are morphological categories;
(iii) ↔ indicates an equivalence relation (a bi-directional implication);
(iv) X′ is a semantic function of X;
(v) ′ indicates a formal difference between the two elements of the relation of the morphological operation;
(vi) ′ can be null if α ≠ β.
The morphological relation between the two French words gros [gro], grosse [gros], for example, is given by the following strategy:
(2) [X]Adj. masc. ↔ [Xs]Adj. fem.
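For readers who find a computational gloss helpful, the form of strategy (2) can be sketched in Python. This is a minimal illustration and not part of the theory: the class names, the category labels and the restriction of the formal difference to the end of the word are assumptions made for the sketch, and condition (iv), the semantic function, is not modelled at all. The point it shows is that one declarative pairing of shapes and categories can be read in either direction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    form: str      # phonological shape, e.g. "gro"
    category: str  # morphological category, e.g. "Adj.masc" (hypothetical label)


@dataclass(frozen=True)
class Strategy:
    """[X + left_tail] left_cat <-> [X + right_tail] right_cat."""
    left_tail: str
    left_cat: str
    right_tail: str
    right_cat: str

    def apply(self, word: Word) -> Optional[Word]:
        # The relation has no privileged direction: try both readings.
        readings = (
            (self.left_tail, self.left_cat, self.right_tail, self.right_cat),
            (self.right_tail, self.right_cat, self.left_tail, self.left_cat),
        )
        for src_tail, src_cat, dst_tail, dst_cat in readings:
            if word.category == src_cat and word.form.endswith(src_tail):
                # Strip the source tail to recover the shared part X.
                x = word.form[: len(word.form) - len(src_tail)]
                return Word(x + dst_tail, dst_cat)
        return None  # the strategy says nothing about this word


# Strategy (2): [X]Adj. masc. <-> [Xs]Adj. fem.
gros_grosse = Strategy("", "Adj.masc", "s", "Adj.fem")

print(gros_grosse.apply(Word("gro", "Adj.masc")))
print(gros_grosse.apply(Word("gros", "Adj.fem")))
```

Both inputs and both outputs are whole words; the part X shared by the two sides is only a variable of the pattern and is never stored as an object in its own right.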
We call this rule a strategy because as a generalization drawn from a particular fact, it can be activated in the production or understanding of new words. A morphological strategy, in other words, captures the morphological relatedness amongst the words that happen to be in a lexicon and allows a speaker to morphologically analyze a word he may not have analyzed before or to create a new word that he may not have in his lexicon or may have forgotten temporarily. The listing of the morphological strategies of a language constitutes a part of the description of that language. It is, therefore, an aspect of linguistic competence, a component of grammar.2 Morphological strategies alone do not account for all formal relationships. They sometimes get a hand from phonology. Each time an aspect of a formal relationship is attributable to phonology, which
covers all and only global, automatic alternations governed by phonotactics, that aspect is suppressed in the morphological strategy (Ford and Singh 1983 and Singh 1987). The relationship between French chanteur and chanteuse can, thus, be expressed by the following strategy:
(4) [Xœr]N. masc. ↔ [Xøz]N. fem.
But the œ ~ ø alternation depends on a phonological rule of French, ø → œ / __ C $, which repairs the ill-formed *ør sequence (cf. Singh 1981 and 1987 and Sommerstein 1977). Given the existence of this phonological rule, the morphological strategy in (4) can be simplified to (5).
(5) [Xør]N. masc. ↔ [Xøz]N. fem.
We can ask ourselves why the suppression of [r] in (5) is morphological rather than phonological. One of the reasons for treating it as morphological stems from the fact that French allows -rz as a word-final coda, at least in the word quatorze, but it doesn’t have any word that terminates in an -œrz rime.
A number of constraints on the nature of possible morphological relationships stem from our basic hypothesis:
(i) No multiple morphologies: derivational, inflectional, clitic, syntactic, etc.
(ii) No morphological operations on units other than the word. Hence, no unit smaller than the word (root, stem, lexeme, basic abstract form, etc.) can exist as an object of a morphological strategy. Words, therefore, have no internal morphological structure.
(iii) The unity of the morphological operation, in particular no segmentation of the operation to create an intermediate level of representation between morphology and phonology; hence, no morphophonology that implies a separation between morphology and morphophonology and no affixation-truncation (Aronoff 1976), nor copying-spelling types of separation (Carrier 1975).
(iv) The operation has no privileged direction so that no special status is given to back-formation as in, for example, Marchand (1969) and Kiparsky (1982).
(v) No category is exclusively determined morphologically. Hence, no conjugation, declension, or intra-extra-paradigmatic kind of typology (Bybee 1980, and DeChêne 1975).
(vi) Morphological integrity of the word: no adjacency condition (Siegel 1979).
These constraints may seem to be negative but we think that it is in the nature of any constraint to impose restrictions and in this way strengthen the hypothesis. This should not come as a surprise to anyone. The only way to demonstrate that we do not need the excluded notions is by induction based on the analyses presented here. We claim that we do not need them. The burden of proof is, clearly, on those who want to dispense with these and introduce complicated devices to account for the facts. The rest of this paper is devoted to examining some of these notions and the arguments that gave birth to them. The arguments in favour of ‘morphophonology’ as a part of morphology are taken up in Ford (1982), Ford and Singh (1983) and Singh (1986).
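The division of labour between strategy (5) and the phonological repair of *ør described earlier for chanteur and chanteuse can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and category labels are hypothetical, and the repair is restricted to word-final ør, the one sequence the text identifies as ill-formed, rather than implementing the general rule as written.

```python
from typing import Optional, Tuple


def strategy_5(form: str, category: str) -> Optional[Tuple[str, str]]:
    """[Xør]N. masc. <-> [Xøz]N. fem., stated over pre-repair shapes."""
    if category == "N.masc" and form.endswith("ør"):
        return form[:-2] + "øz", "N.fem"
    if category == "N.fem" and form.endswith("øz"):
        return form[:-2] + "ør", "N.masc"
    return None  # the strategy does not relate this word to another


def repair(form: str) -> str:
    """Phonology: the ill-formed final sequence *ør surfaces as œr."""
    if form.endswith("ør"):
        return form[:-2] + "œr"
    return form


stem = "ʃɑ̃t"  # the shared part X of chanteur and chanteuse

# From the feminine to the masculine: the strategy gives the shape in ør,
# and phonology alone is responsible for the surface vowel.
related = strategy_5(stem + "øz", "N.fem")
if related is not None:
    shape, category = related
    print(shape, category, repair(shape))
```

Because the vowel quality is supplied by the repair, the strategy itself never has to mention the œ ~ ø alternation, which is the simplification from (4) to (5).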
Summary of the Arguments for and Against Different Types of Morphology
In this section we will summarize the arguments for and against the inflectional/derivational distinction. We conclude, of course, that no argument in favour of the distinction gives an airtight criterion and that, on the contrary, by postulating that distinction we predict material differences that never manifest themselves; we also lose generalizations from the point of view of our formal constraints on the processes that are identical and, consequently, we complicate the grammar. The orthodox position on multiple morphology is clearly expressed by Matthews (1974: 38):
‘According to the most usual division of subjects, the field of morphology is divided into two major subfields: one concerned with processes of inflection and the other with what are usually referred to as processes of word-formation (derivation and composition).’
The basic arguments in favour of the proposed distinction are:
(i) two forms of the same word versus two different words;
(ii) paradigmatic motivation versus innovation and irregularity;
(iii) syntactic motivation versus absence of syntactic motivation;
(iv) no monomorphemic substitute versus a possibility of such a substitute;
(v) productive versus non-productive;
(vi) never change the category versus can change the category (discussed in Anderson (1981) and Aronoff (1976) amongst others);
(vii) dominated by X versus dominated by X-bar (Aronoff 1976);
(viii) definable by extension versus indefinable by extension (Anderson 1981 and Aronoff 1976);
(ix) exterior to derivational affixes versus interior to inflections (Anderson 1981).
We attempt to answer arguments (ii), (vi) and (ix) in Singh and Ford (1980). As for the others,
(i) depends on the definition of the word. The dictionary test proposed by Matthews is clearly inadequate (Section 1.1 provides part of the reason why this criterion cannot be used);
(ii) this criterion is at best an indication, as Matthews himself admits, since strictly applied, it forces him to declare that number in Romance languages (Italian cugino ‘cousin’ versus cugini) is not inflectional (an anti-intuitive conclusion according to him);
(iii) simply is not true (cf. set in English);
(iv) the reader is referred to the discussions in Aronoff (1976: Chapter 3) and Anderson (1981: 14). Matthews himself gives Xless as a counter-example (Matthews 1974);
(v) this criterion is certainly the most interesting and Anderson (1981) believes it is the only valid one. Unfortunately, it seems to be untenable in the light of arguments like Lieber’s (1981: 30). In short, if an inflectional form is available for word-formation operations (cf. Matthews’ definition cited above), it has to be in the lexicon, hence, dominated by X and not by a superior category;
(vi) is rejected by Anderson himself with examples from Fula and Kwakwala.
Not only are the arguments in favour of the distinction untenable, but other arguments exist in favour of only one morphology.
(i) in Singh and Ford (1980) we discuss the fact that the distinction is part of only one grammatical tradition and that it is neither universal nor necessary;
(ii) we also show that it is not motivated diachronically as could be expected;
(iii) Lieber (1981) shows that both must have the same status, namely, a lexical one;
(iv) finally, nobody has been able to formulate a morphological constraint that would separate the two kinds of morphology.
Concerning Units Smaller than the Word in Morphology
In this section, we limit ourselves to the notions of root, stem, affix and morpheme as bases of the morphological operation but implicitly the arguments extend themselves to any abstract notion such as the lexeme or Matthews’ and Anderson’s ‘base form’ (Anderson 1981 and Matthews 1974), among others. The position defended here is essentially that of Kurylowicz (1949) for whom all these notions (with the exception of the morpheme, implicitly defined in certain morphological operations) are abstractions emanating from different linguistic theories that turn out to be totally lacking psychological reality (the difficulties involved in defining the latter notion notwithstanding). We shall now give a few examples of morphological analyses that make false predictions precisely because the generalizations are arrived at with the help of roots or stems instead of words. The data come principally from Lieber (1981) and Matthews (1974).3
The morpheme, if we give it its traditional meaning (smallest unit having meaning), does not exist in our theory. This notion is a by-product of structuralism. It stems from certain principles such as double articulation or Hjelmslev’s (1963) isomorphism as well as a taxonomic methodology based on the principles of segmentation and classification. A summary analysis of the structuralist class of morphemes should be sufficient to convince one that, from the point of view of
morphology, we are faced with completely different units that happen to be arbitrarily regrouped according to the structuralist perspective. It is also clear, by the subcategorizations proposed, that the structuralists themselves were at least partially conscious of the arbitrary status of the morpheme. Consider, for example, distinctions such as meaning morpheme (moneme) versus grammatical morpheme, not to say anything about the morphemic status of certain morphophonological processes and in particular the notion of zero-morpheme (Matthews 1974: Chapter 7). Some of these problems with the ‘morpheme’ taken as the unit of morphological analysis are brought up by Matthews (1974: Chapter 5). Others show up elsewhere in discussions where they are not always perceived as problems. Roughly, there are at least three different conceptions of the morpheme:
(i) The morpheme (moneme) as the minimal unit having a meaning. It is more or less the definition of the European structuralists (Hjelmslev 1963, Martinet 1960, Matthews 1974, and others);
(ii) The morpheme as the active part of the word in word-formation processes. This is Kurylowicz’s point of view (Kurylowicz 1949), also defended by Aronoff (1976);
(iii) A class of allomorphs in complementary distribution. This is the American structuralists’ definition (Bloomfield 1933, Harris 1978 and Nida 1946, for example). This definition is also implicit in the morphological representations of most generative studies.
One could imagine that these are three descriptions of a unique set but it is easy to demonstrate that it is not the case. We will use (6) to show that (i) and (ii) do not describe the same set.
(6)
Table 1.1

        Verb inf.      Adj. masc.    Noun fem.     Adverb
(i)     négliger       négligent     négligence    négligemment
(ii)    (se) méfier    méfiant       méfiance      --
(iii)   étonner        étonnant      --            étonnamment
(iv)    --             fréquent      fréquence     fréquemment
(v)     --             présent       présence      --
(vi)    --             méchant       --            méchamment
According to the first definition of morpheme, the [ɑ̃] in (6i–iii) is a morpheme to which we can roughly attribute the meaning ‘the fact of Xer’ where ‘Xer’ is the verb to which it is related. On the other hand, in (6iv–vi), [ɑ̃] cannot be a morpheme since the adjective in question does not have a corresponding verb. For example, fréquent cannot have the meaning ‘the fact of fréquer’ since fréquer does not mean anything in French. According to definition (ii) of the morpheme, all [ɑ̃] are morphemes since they participate in word-formation operations. In the following strategies that describe the formation of nouns and adverbs from adjectives, the constituent [ɑ̃] is explicitly mentioned:
(7) (i) [Xɑ̃]Adj. masc. ↔ [Xɑ̃s]N. fem.
(ii) [Xɑ̃]Adj. masc. ↔ [Xamɑ̃]Adv.
Whether [ɑ̃] is a morpheme or not according to definition (i) has no importance. We conclude that (i) and (ii) do not define the same set. As for (iii), it defines something else again; for example, the abstraction underlying the set -z, -n, -rən in the English nouns boys, oxen, children. It is what we usually represent, following Chomsky and Halle (1968), by a feature [+plur]. This definition of the morpheme isolates a class of meanings (sememes) that does not coincide in any way with the sets created by definitions (i) and (ii). Definition (iii) has no formal constraint and can permit, for example, the elements of meaning shared by the words mushroom and mycology to be defined as a morpheme. We can see that according to the definitions given the term morpheme refers to: (i) a set of meanings each systematically associated with a form; (ii) a set of forms; (iii) a set of meanings. It should be clear from what has been said earlier that it is only class (ii) that is interesting for morphology. If one must use the term morpheme in morphology, it can refer only to (ii).
But given the present confusion, it is probably better to abstain from using the term and to conclude that the notion of morpheme is not useful in morphology. The expression ‘word-part’ will be a happier choice. The theoretical framework sketched out above brings out the heterogeneity of the ‘morpheme’ class and also explains and somewhat
clarifies the prevailing confusion. But it must be clear that within our framework there is more than one way of creating ‘word-parts’:
(i) A word-part can be the object of a morphological operation: affixation or truncation. For example, in the following rules the ‘morphemes’ mal-, -el, -o, -ais are respectively prefixed, suffixed, infixed or deleted.
(8) [X]Adj. ↔ [malX]Adj. mal(commode), mal(habile), mal(adroit), etc.
[X]N ↔ [Xèl]Adj. mort(el), form(el), résidu(el), etc.
[X (rime)]Adj. ↔ [XoY]Adj. anglo-américain, hollando-belge, etc.
(ii) Word-parts can also be used in another way, as class markers. For example, in the following strategies,
(9) [Xal]N. sing ↔ [Xo]N. pl. cheval-chevaux, mal-maux, etc.
[X]Adj. ↔ [ilX]Adj. légal-illégal, lisible-illisible, etc.
-al and -l serve to mark classes but they are not ‘morphemes’ in the same way as those in (i). Support for this point of view comes from the fact that class (ii) elements can often be replaced by a variable, as in a famous rule, the plural-formation of nouns in Cochabamba Quechua:
(10) [XC]N. sg. ↔ [XCkuna]N. pl.
[XV]N. sg. ↔ [XVs]N. pl.
We have yet to find a reason to give particular importance to these facts. There are other classes of morphemes which we can arrive at by other means of segmentation but that, for various reasons given below, are not useful classes, and are, consequently, not explicitly representable in our framework.
Let us now look at the arguments in favour of the existence of roots and stems. Matthews (1974: 79) criticizes Priscian, the Roman grammarian, for postulating a morphological strategy that can be represented in the following way:
(11) [Xs]V 2 sg. pres. ↔ [Xbam]V 1 sg. imperf.
According to Matthews, this analysis of the relationship between forms such as amas-amabam is untenable since it stops one from formulating generalizations concerning the distribution of -ba and -m, forms that appear separately in other words such as amabat and amarem. Amabam must therefore be analyzed as ama-ba-m or even as:
(12) [[[[am]root -a]stem -ba]stem -m]word
This analysis, obviously, calls for the creation of a series of terms for word-parts.
(13) root: am
stem: am(amo), ama(amas), amaba(amabam), amav(amavi), etc.
inflectional formants: -ba, -m, -i, -o, etc.
ending: i, o, etc.
Note that Matthews’ argument for roots and stems is that they are essential to permit generalizations of the type: (i) -ba is an imperfect tense marker and (ii) -m is a first person marker. But consider the following two strategies:
(14) [Xat]V 3rd sing. present ↔ [Xabat]V 3rd sing. imperfect
[Xm]V 1st sing. present ↔ [Xs]V 2nd sing. present
They clearly permit us to state that -m is a first person marker and that -ba- is an imperfect tense marker. Yet this analysis does not need any notion of root or stem. Matthews’ argument in favour of these categories, therefore, is not convincing.
According to Lieber (1981: 10), the use of stems (in the lexicon) allows for generalizations that are essential. For example, Wurzel (1970) was the first to note that the German noun structure is essentially [[X]stem Y]inflection and not [[X]root Y]inflection.
(15)
     Sg      Pl       Sg      Pl       Sg       Pl        Sg      Pl
N    Bach    Bäche    Vater   Väter    Geist    Geister   Name    Namen
A    Bach    Bäche    Vater   Väter    Geist    Geister   Namen   Namen
G    Bachs   Bäche    Vaters  Väter    Geistes  Geister   Namens  Namen
D    Bach    Bächen   Vater   Vätern   Geist    Geistern  Namen   Namen
If we postulate [root + inflection], we have four declension classes:
(16)
     Sg   Pl    Sg   Pl    Sg   Pl    Sg   Pl
N    Ø    e     Ø    Ø     Ø    er    Ø    n
A    Ø    e     Ø    Ø     Ø    er    n    n
G    s    e     s    Ø     es   er    ns   n
D    Ø    en    Ø    n     Ø    ern   n    n
This analysis, the argument goes, does not let us conclude that /s/ is explicitly the genitive singular marker, or that /n/ is the dative plural marker. On the other hand, an analysis that, besides the root, explicitly uses stems, (as in 17): (17)
Root   Bach    Vater   Geist     Namen
Stem   Bäche   Väter   Geister   Namen
permits the following generalizations:
(18) sg Nom, Acc, Dat → root, except if the noun ends in -n, then n → Ø
sg Gen → root + s, except if the noun ends in -st, then + es
pl Nom, Acc, Gen → root
pl Dat → root + n, except if the noun ends in -n, then n → Ø
But notice that we can very well state these facts without using roots or stems. (19)
(i) a. [X]Nom. ↔ [X]Acc.
b. [Xə]Nom. sing. ↔ [Xən]Acc. sing.
(ii) a. [X]Nom. sing. ↔ [Xs]Gen. sing.
b. [X]Nom. plur. ↔ [X]Gen. plur.
(iii) a. [X]Nom. ↔ [X]Dat.
b. [Xr]Dat. sing. ↔ [Xrn]Dat. plur.
(iv) a. [X]Nom. sing. ↔ [X]Nom. plur.
b. [XC]Nom. sing. ↔ [XCə]Nom. plur.
c. [X ]Nom. sing. ↔ [X r]Nom. plur.
d. [Xst]Nom. sing. ↔ [Xster]Nom. plur.
The analysis presented in (19) also lets us make the generalization that ‘stems’ are available for derivational word-formation, as in Bücherfolge ‘book series’. According to Lieber (1981: 15) these forms are made up of stem + noun but note that we can also analyze them as N pl. + N. It has been proposed by Kiparsky (1982) and others that the regular plural should not be available for compounding processes but it can be shown that in various languages the regular plural is absolutely necessary (cf. Spanish limpia-botas that can only be analyzed as [X]V ↔ [X]N pl. and not [[X]V ↔ [Y]N -s]N pl since this word is singular in Spanish). We conclude that the arguments in favour of roots and stems are not convincing. There are also disadvantages in using these notions. Here are three additional arguments against the use of roots and stems.
(i) They are not ‘psychologically real’, i.e., they are anti-intuitive and badly defined compared to a notion like the word.
(ii) No category can be attached to them. In particular, a root can be the input to a morpholexical rule whose output can be various stem categories.
(iii) An apparent advantage of these notions is that they allow us to avoid using the operation of substitution (truncation). Inasmuch as the latter is necessary, the advantage remains merely apparent (Ford and Singh 1985).
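The word-based strategies in (14) and (19) can be read as two-way patterns over whole words, with no appeal to roots or stems. The following Python sketch is only an illustration of that reading. The class names Side and Strategy, the use of ordinary spelling instead of phonological forms, and the restriction to constant material before and after a single variable X are all assumptions made for the example, not part of the framework itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Side:
    """One side of a strategy: constant material around the variable X.

    Hypothetical helper: 'prefix' and 'suffix' stand for whatever is
    written to the left and right of X inside the brackets.
    """
    prefix: str
    suffix: str
    category: str

    def match(self, word: str) -> Optional[str]:
        """Return the value of X if the word fits this side, else None."""
        if not (word.startswith(self.prefix) and word.endswith(self.suffix)):
            return None
        end = len(word) - len(self.suffix)
        if end <= len(self.prefix):  # X must not be empty
            return None
        return word[len(self.prefix):end]

    def build(self, x: str) -> str:
        """Spell out this side for a given value of X."""
        return self.prefix + x + self.suffix


@dataclass(frozen=True)
class Strategy:
    """A two-way relation between two word patterns."""
    left: Side
    right: Side

    def apply(self, word: str, category: str) -> List[Tuple[str, str]]:
        """Map a whole word across the strategy, in either direction."""
        results = []
        for source, target in ((self.left, self.right), (self.right, self.left)):
            if source.category != category:
                continue
            x = source.match(word)
            if x is not None:
                results.append((target.build(x), target.category))
        return results


# (14): [Xat]V 3rd sing. present <-> [Xabat]V 3rd sing. imperfect
imperfect = Strategy(Side("", "at", "V 3 sg. pres."),
                     Side("", "abat", "V 3 sg. imperf."))

# (19 ii a): [X]Nom. sing. <-> [Xs]Gen. sing.
genitive = Strategy(Side("", "", "N nom. sg."),
                    Side("", "s", "N gen. sg."))

print(imperfect.apply("amat", "V 3 sg. pres."))
print(imperfect.apply("amabat", "V 3 sg. imperf."))
print(genitive.apply("Vaters", "N gen. sg."))
```

Both directions use the same pattern pair, and at no point is amat or Vaters cut into a root, a stem and an ending; the only things the sketch manipulates are whole words and the variable X.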
On Morphophonology
In certain grammatical traditions, the morphological operation cannot be broken up into various kinds of suboperations. For Kurylowicz (1949), for example, Baum and Bäumchen are tied by a complex suffixation and umlaut operation. These operations are considered as being of a different nature in other schools. Andersen (1969), for example, criticizes Kurylowicz for having mixed two different levels of representation in the morphological operation. In the above example, the suffixation of -chen is a morphological operation but the umlaut is a morphophonological operation.
The position adopted by Kurylowicz represents that of traditional historical linguistics, and Andersen’s that of the Prague structuralists under the influence of Trubetzkoy (1929), to whom we owe the term ‘morphophoneme’, and that of Bloomfield’s followers, to whom we owe the development of the domain of morphophonology, also called morphophonemics. A third position can be added to the first two: that of Chomsky and Halle’s Generative Phonology (1968) and its offshoots that profit from the separation of morphophonology from morphology by treating it as part of phonology proper. More recently, their position has come under attack by Hooper (1976), Hudson (1975) and Vennemann (1974). They give an independent status to morphophonology. We can schematize this as in Figure 1.1.
[Figure 1.1: diagram relating the positions of Kurylowicz (1949), Chomsky & Halle (1968), Hooper (1976) and Dressler (1985) to the domains Phonology, Morphophonology and Morphology]
Even if the positions taken by Chomsky and Halle (1968), Hooper (1976) and Kurylowicz (1949) do not require further elaboration, that of Dressler (1985) calls for some clarification. Dressler (1985) maintains that morphophonology constitutes neither part of phonology nor an autonomous module, but a quasi-module which shares certain properties with both phonology and with morphology. Although he makes some convincing arguments against the autonomy of morphophonology and its incorporation into phonology, he does not, we believe, demonstrate conclusively that it cannot be incorporated into morphology. His brilliant description of morphophonology
does not, we regret, prevent us from considering it a part of morphology, perhaps on the model of another part of morphology which could be called ‘affixology’ (for more detailed arguments, see Singh 1987, and forthcoming). From what has already been said, it is clear that, concerning this problem, Kurylowicz’s position is closer to the theory defended here than the others are (for details, see Ford and Singh 1983). Basing ourselves on the theory of phonology hinted at above, it is clear that the changes triggered by a particular morpheme cannot be phonological, i.e., they are not motivated by a well-formedness condition. No well-formedness condition of German excludes Baume. Umlaut is therefore not a phonological operation in our terminology. We must still see if this kind of operation has a special status compared to the other morphological operations as the supporters of morphophonology claim. Let us examine the arguments for the existence of a morphophonological level between morphology and phonology. First of all we will look at the arguments of Andersen (1969), who claims that morphophonological rules have an independent status since they can evolve diachronically in different ways: either they become morphological processes or they go back to being phonological. Andersen identifies these two directions of evolution with two steps in a cycle: ‘we identified two phases in each morphophonemic change: a covert phase, consisting in the formulation of a new morphophonemic rule, and an overt phase consisting in the gradual elimination of lexical exceptions to that rule’ (Andersen 1969: 828). The Ukrainian example he gives to support the second possibility seems questionable to us (and impossible to account for in our theory). According to his analysis, the prefix ob ‘around’ triggers a morphophonemic rule of vocalic epenthesis (‘the Paragoge rule’) in the production of obibjut ‘to nail around’ formed from bjut ‘to nail’.
According to our point of view, such an analysis adds an extra level to the derivation, failing to recognize the vocalic epenthesis for what it really is; to wit, a morphological operation. Moreover, Andersen himself supplies an argument for this position when he states: ‘We saw above how the domain of the Paragoge rule has changed since the fall of the jers, so that it now applies in environments which are narrower, in others wider, than originally’ (Andersen 1969: 818). The ability to change the scope of its application is the hallmark of
a word-formation rule, a significant difference between a morphological strategy and a phonological rule, which is, in the theoretical framework we adopt, always global. Bybee (Bybee 1980, Bybee and Brewer 1980, and Hooper 1979) claims that morphophonological rules play a role in the acquisition and the diachrony of Spanish. Ford (1980) presents another analysis of the same facts without using morphophonological rules. We shall not repeat here the contents of that paper. But the basic argument for the incorporation of morphophonology into morphology is that where the morphological operation is multiple (i.e., where it comprises an operation like affixation, truncation or substitution besides a morphophonological operation) the latter never evolves alone. If the domain of the morphophonological operation is extended, it is extended together with its corresponding morphological operation. In this way we agree with Kurylowicz when he says that the morphophonological operation is secondary or dependent. The German umlaut example serves to illustrate. A morpheme that does not trigger umlaut has never started triggering it. Another example would be the voicing of [f] in the pluralization process of certain English words such as wife ↔ wives. The prediction is that this voicing process cannot extend itself to a morpheme that does not trigger it (for example, the possessive morpheme -z). As far as putative counter-examples to the claim that morphonology does not generalize diachronically are concerned, Singh (1996) argues that given Malkiel’s (1982: 248) withdrawal of his (1976) Spanish case, there is not much left except for the recent persuasive case presented in Morin, Langlois and Varin (1990). They argue that ɔ → o, the unquestionably morphonological rule of seventeenth century French, generalizes to become what Morin and we would call a global, purely ‘phonological’ rule.
Four things are clear about this rule: (i) it was a ‘morphonological rule’, (ii) it now clearly is a ‘phonological rule’, (iii) it never generalized in the sense that it never marked or penetrated word-formation processes with which it was not originally associated, and (iv) it merely diffused lexically, at first in the original category which it was associated with and later cross-categorically as a lexical redundancy rule. The appearance of it having generalized to a ‘phonological rule’ is largely a consequence of the fact that the other major category, the verb, happened not to have any word-final /ɔ/’s (Morin, et al. 1990: 516).
The generalization of a ‘morphophonological rule’ qua morphophonological rule must, it seems to us, include a takeover of morphological categories, something not exhibited by ɔ-tensing, for it goes from plural ɔ-tensing to a progressively more cross-categorial redundancy rule, ending up as the well-formedness condition without ever being triggered by another word-formation process. Although Morin, Langlois and Varin are, in a sense, right that it, or more accurately, the neutralization it describes, is phonologized, their case seems to be a case of ‘phonologization’ without generalization of the sort we take to be criterial for the issue at hand. It seems to involve simplification rather than the generalization of a ‘morphophonological rule’ qua ‘morphophonological rule’: by the time it can be said to have penetrated another word-formation process, it has acquired the status of an automatic alternation, at least for innovative speakers. Although it is one of the most carefully documented studies of what the authors refer to as the progressive stripping of a morphological rule and lexical conditions on a ‘morphophonological rule’, it need not, fortunately, be construed, nor is it intended, as presenting a case for a ‘morphophonological rule’ violating the prohibitions formulated in Ford and Singh (1983: 67; see note 4):
Fourth, morphophonological alternations do not act independently in the historical evolution of a language. We do not wish to affirm by this that one cannot witness the extension of a morphophonological process from generation to generation as the opposite is easy to attest. Neither do we wish to affirm that a morphophonological alternation could not be produced spontaneously during the evolution of a language. Picard (1977) gives excellent examples of this type of phenomenon, especially the deletion of l in Québec French.
What we wish to affirm is that a morphophonological alternation is not dynamic in the sense that it can wander from one morphological context to another. For example, we would like to claim that the mark of the imperfect or the morpheme -los could not trigger umlaut in German and that diminutive -ito/-ita could not trigger a monophthongization in the Spanish noun. We think, until proof to the contrary is offered, that historical changes of this form are impossible and that a morphophonological process cannot be mobile in this sense.
Fifth, a morphophonological alternation can spread without there being present the phonological conditions that were present at its appearance. For example, the origin of German umlaut is to be found in vowel harmony that worked from right to left and was triggered by an anterior high vowel. To cite a concrete example, the diminutive suffix -kin triggered umlaut when it was added to the noun, hence Baum-kin > Bäumkin. Subsequently the suffix changed as the result of historical processes to -chen. The phonological condition which triggered umlaut, the vowel i in the suffix, is no longer present, but the suffix -chen continues to trigger umlaut even when it is added to new forms. For example, Elefant, a loan word, has the diminutive Elefäntchen.
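The claim that umlaut belongs to the -chen strategy itself, and does not wander to other strategies, can be pictured with a small Python sketch. It is a deliberately crude illustration working on spelling only: the function names umlaut_last_vowel, diminutive and privative are invented for the example, capitalized initial vowels are not handled, and real German diminutives involve further adjustments that are ignored here.

```python
# Spelling-level stand-in for the vowel fronting involved in umlaut.
UMLAUT = {"a": "ä", "o": "ö", "u": "ü"}


def umlaut_last_vowel(form: str) -> str:
    """Front the last umlautable vowel of the form; 'au' becomes 'äu'."""
    for i in range(len(form) - 1, -1, -1):
        ch = form[i]
        if ch in UMLAUT:
            # In the diphthong 'au' it is the first element that is marked.
            if ch == "u" and i > 0 and form[i - 1] == "a":
                return form[: i - 1] + "äu" + form[i + 1:]
            return form[:i] + UMLAUT[ch] + form[i + 1:]
    return form  # nothing to umlaut


def diminutive(noun: str) -> str:
    """One strategy, one operation: umlaut and -chen apply together."""
    return umlaut_last_vowel(noun) + "chen"


def privative(noun: str) -> str:
    """The -los strategy is stated without umlaut, so none can appear."""
    return noun.lower() + "los"


print(diminutive("Baum"))     # Bäumchen
print(diminutive("Elefant"))  # Elefäntchen
print(privative("Baum"))      # baumlos
```

Because the vowel change is written inside diminutive and nowhere else, the loan word Elefant undergoes it as soon as it enters that strategy, while no amount of use of privative could ever produce an umlauted form.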
Morphology and the Lexicon
Having argued that the sub-field of lexical relatedness that is properly called morphology can be described in terms of what we call WFSs, we should clarify their status. We see them as describing relationships amongst items that can be said to exist in the lexicon. They describe, in other words, the patterns that exist in the lexicon and have only a limited number of moments of reality (cf. Hetzron 1972 and Vennemann 1974). We reject the standard separation of morphology from the lexicon and take the view that the former is simply the form in which the latter comes. This allows us to make the lexicon much more directly responsible for what is called morphology, without, of course, the intervention of filters (cf. Halle 1973), blocking (cf. Aronoff 1976), or strata (cf. Kiparsky 1982). The majority of contemporary models of the lexical component seem to divide what they call the lexicon into two components, one for stem morphemes and another one for affixes. Some divide the second into two separate warehouses: one for what they call inflections, and one for what they call derivational affixes. Those, like Aronoff (1976), who introduce the affixes with rules, make a distinction between the component where the possible inputs to these rules live and the rule component itself. The only difference between Halle (1973) and Aronoff (1976), for example, seems to be that Aronoff insists that only full existing words can be inputs to his WFRs. Listing affixes à la Selkirk (1982) or introducing them à la Aronoff probably, as Kiparsky (1982) argues, amounts to the same
thing, i.e., if the theory of minimal signs is accepted or, equivalently, the theory that only words can be signs is rejected or ignored. It is interesting to note that Halle (1989), as (21) shows, splits the lexical component even more violently.
(21) [Figure 1.2: Halle’s 1989 # 12. Diagram with the components Vocabulary, Morphemes, Morphology, Words, Readjustment, Spell-out and Phonology, and the levels DS, SS, PF and LF]
We have, however, argued for a more richly articulated singular lexical component without an autonomous rule component. Patterns, which we believe can be used to analyze new words or create them, are established in that component itself. Words, in other words, are stored integrally and the relationships amongst them are formalized by redundancy rules that live exactly where the words do. They do not live only next to them, they live inseparably from them. Once learned, they can be used to create or parse new words. The patterns described by our WFSs are available not in a separate component but in the very same component. They are more than integral parts of that component: they are constitutive of it. Notice that we avoid the Hallean (1973) trichotomy or double dichotomy (Halle 1989). Whereas he has a dictionary, a list of ‘stem morphemes’, and an autonomous rule component he calls morphology, or a list of words, a list of morphemes and morphology, we have argued that a speaker has only something more like a dictionary of words as richly or poorly specified as phonology and semantics permit and demand, except that that dictionary does not look anything like what is indicated by
that term. Neither words nor affixes exist independently of the words they give birth to or of which they are born. Morphology, in other words, is not a way of structuring the lexicon: the autopoiesis in question is the very form in which the lexicon comes. To conclude: the lexicon is a set of creative processes determining as much the form of the words it generates as their eventual use. Its general form is that of a set of word-formation strategies linked in a network. Every node in the network is a word whose minimal properties are a phonological form and a syntactic category. Its semantic value, i.e., its use, is determined by the scripts and metaphors of the realm of discourse where it occurs, by the context of its use. A WFS is simply a description of a particular part of the network.
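One way to picture a lexicon in which words and patterns live in the same component is the small Python sketch below. It is an assumption-laden toy, not a proposal about mental storage: the class Lexicon and its methods add_word, add_strategy, links and coin are invented names, forms are given in ordinary French spelling, and strategies are reduced to pairs of word endings. The words are taken from Table 1.1.

```python
class Lexicon:
    """Toy network: nodes are (form, category) pairs; the strategies are
    kept in the same object as the words they relate."""

    def __init__(self):
        self.words = set()       # integrally listed words
        self.strategies = []     # patterns holding among those words

    def add_word(self, form, category):
        self.words.add((form, category))

    def add_strategy(self, cat_a, ending_a, cat_b, ending_b):
        """Record the two-way pattern [X+ending_a]cat_a <-> [X+ending_b]cat_b."""
        self.strategies.append(((cat_a, ending_a), (cat_b, ending_b)))

    def _reachable(self, form, category):
        """Yield every (form, category) one strategy away from the given word."""
        for a, b in self.strategies:
            for (src_cat, src_end), (tgt_cat, tgt_end) in ((a, b), (b, a)):
                if category == src_cat and form.endswith(src_end):
                    x = form[: len(form) - len(src_end)]
                    if x:  # the variable X must not be empty
                        yield (x + tgt_end, tgt_cat)

    def links(self, form, category):
        """Parse: listed words related to this one (edges of the network)."""
        return sorted(w for w in self._reachable(form, category) if w in self.words)

    def coin(self, form, category):
        """Create: possible words the patterns license but the list lacks."""
        return sorted(w for w in self._reachable(form, category) if w not in self.words)


lex = Lexicon()
for form, cat in [("fréquent", "Adj"), ("fréquence", "N"), ("fréquemment", "Adv"),
                  ("présent", "Adj"), ("présence", "N")]:
    lex.add_word(form, cat)
lex.add_strategy("Adj", "ent", "N", "ence")
lex.add_strategy("Adj", "ent", "Adv", "emment")

print(lex.links("fréquent", "Adj"))
print(lex.coin("présent", "Adj"))
```

The same patterns serve both uses: links recovers the listed relatives of fréquent, while coin shows what the network would license for présent, here an adverb that Table 1.1 records as a gap. Nothing in the sketch stores a stem or an affix as an object of its own.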
Speech Errors and Integral Listing
As most models of morphology, with the possible exception of Selkirk (1974), subscribe to Halle’s view (1973: 16) that the word-formation component (or morphology in his [1989] proposal) is used only under exceptional circumstances since retrieval under normal circumstances is effected from the dictionary supposed to be stored in the speaker’s permanent memory, it is legitimate to ask why what we may somewhat inaccurately call the enriched dictionary (enriched along the lines suggested in Ford and Singh 1985) is not enough to do what needs to be done. Given the existence of some dictionary in permanent memory that, as even Halle admits, is called upon in normal circumstances, one should derive morphology from it rather than have the former produce it. The backward grammars Pāṇinians have successfully passed off as structuralism not only confuse the sum of parts with the whole and cause with effect, but also make it impossible not to separate the form from the content (cf. Zager 1979). As far as the abnormal circumstances are concerned, it is easy enough to show that an integral listing (IL) approach such as ours that integrates the words and the WFRs can easily account for normal and pathological speech errors that lead Caplan, et al. (1972: 172) to the view that ‘words and rules are autonomous and can be differently affected’. As a detailed description of the IL model is given in Martohardjono and Singh (1992), we shall take up only Caplan, Kellar and Locke’s argument.
Caplan, et al. suggest that a word-finding programme alone cannot explain neologistic facts. They do not, like Luria and Brown, discuss possible phonemic difficulties, but instead advance the possibility of a separate breakdown in a postulated rule-component. It cannot be established from normal speech whether rules and words are autonomous or interdependent. However, according to these authors, neologisms suggest that such a separation is called for. Thus they write (ibid.: 172):
The separation of the possibilities cannot be made on the basis of normal language, and, indeed, even in distorted language, without neologisms the distinction is impossible as an abnormality in word-finding alone might give the appearance of an abnormality in application of rules. The neologism, however, presumably representing a word not found, can be examined for appropriate inflection independent of its semantic value. If, on some occasions, it is properly inflected, acceptable rules are apparently in action; if on other occasions inflection is incorrect, the rules, previously demonstrated to exist, are misapplied or not used at all.
They state that their data accord with the view that ‘words and rules are autonomous and can be differentially affected’ by aphasia. Although this is certainly a plausible hypothesis, we do not think that their data (at least as presented in the article) necessitate it. For example, they cite the error I persets, where perset is the presumed neologistic stem, misinflected for the third person sg. Notice that this analysis is made on the assumption that perset is the ‘base form’. It seems to us equally plausible to say that this is not an error in inflection, for it could be argued, especially if we look at the next example given (persessing), that the base form is persess and that the [t] slipped into the utterance.
This is not such a far-fetched proposition, given that the patient seemed to be having particular phonemic problems in producing that string anyway: ‘Many things I didn’t have years agoth and I persets abowth abrow.’ If, on the other hand, one wanted to stay away from phonetic explanations, there is even another morphological alternative: one could take persets as what some call a ‘stem’ (there are many English verbs that end in -s) and in this case we would not have to do with an error of inflection. Obviously, we are not arguing that these analyses are superior to that of the
authors; we are merely pointing out that the data do not force the postulation of the autonomous rule-component. If, on the other hand, we wanted to explain these errors within the framework of an IL lexicon, we would not have to say, as these authors do, that the jargonaphasic has two separate problems: one of word-finding and one of rule-application. We could simply say that the neologism, once formed (whether it be by a random syllable generator or any other device), is used like any word of the specific category it replaces. It can then be stored in the lexicon and as such can follow the patterns already established there. Thus, if it replaces a verb, it can (but does not have to) acquire inflectional endings given by patterns already existing in the lexicon. If it replaces a noun, however, it would not have to be formed, as an Autonomous Rule Component (ARC) model would require, by a derivational rule, since words are not divided into stems and derivational affixes in an IL model. They are stored in toto although they can be analyzed, if one so wishes, into parts. This view would free us from having to explain why neologistic nouns never display derivational endings.
Notes
1. This text is a revised and expanded version of Ford and Singh (1991). We are grateful to Gita Martohardjono and David Parkinson for assistance far beyond the call of duty. Work on this paper was in part supported by grants from Social Sciences and Humanities Research Council of Canada (SSHRC) and Fonds pour la formation de chercheurs et l’aide à la recherche (FCAR), Québec.
2. It seems necessary to state this since the type of analogical process that results from the application of a morphological strategy in the production of a new form has recently been characterized as being extragrammatical [5, 11]. According to these authors, the sentences of any language always contain a number of acceptable but ungrammatical sentences. Among these some have a corresponding grammatical form and are only errors. Others seem to be perfectly acceptable.
(i) Bill will try and do it = Bill will try to do it
He’s good and willing = He’s very willing
According to them, these structures and others are resistant to syntactic analysis. They are examples of analogy (the extension of existing forms to new functions). We reject the dichotomy.
3. We take up the analysis of Latin by Matthews and of German by Lieber only as examples. A Pāṇinian analysis of any language can be shown to suffer from the same defects.
4. It is presented as a case against Hooper (1979: 91) who rules out the stripping of conditions on a ‘morphophonological rule’ as a possible evolutionary development. The conditions under which the stripping in question can lead to the ‘phonologization’ of an alternation expressed by Hooper as an MP-rule must be studied very carefully, for even a case of the sort that Morin et al. make is not all that common.
References
Andersen, H. 1969. ‘A study in diachronic morphophonemics: The Ukrainian prefixes’. Language. 45(4): 807–30.
Anderson, S. 1981. ‘Where’s morphology?’, in Inflectional Morphology: Introduction to the Extended Word-and-Paradigm Theory, ed. by T. Thomas-Flinders, pp. 1–40. UCLA Occasional Papers 4. Working Papers in Morphology.
Aronoff, M. 1976. Word Formation in Generative Grammar. Cambridge, MA: MIT Press.
Benson, J.W. 1990. Patañjali’s Remarks on Aṅga. Delhi: Oxford University Press.
Bever, T.G., J.M. Carroll and R. Hurtig. 1976. ‘Analogy, or Ungrammatical sequences that are utterable and comprehensible are the origins of new grammars in language acquisition and linguistic evolution’. In An Integrated Theory of Linguistic Ability, ed. by T.G. Bever, J.J. Katz and T.D. Langendoen, pp. 149–83. New York: Thomas Y. Crowell Press.
Bloomfield, L. 1933. Language. New York: Holt, Rinehart and Winston.
Bybee, J. 1980. ‘Morphophonemic change from inside and outside the paradigm’. Lingua. 50: 45–59.
Bybee, J. and M.A. Brewer. 1980. ‘Explanation in morphophonemics’. Lingua. 52: 201–42.
Caplan, D., L. Kellar and S. Locke. 1972. ‘Inflectional neologisms in aphasia’. Brain. 95: 169–72.
Carrier, J. 1975. Reduplication in Tagalog. Unpublished ms. Cambridge, MA: MIT.
Carroll, J.M. 1974. ‘Linguistic performance and diachronic analogy’. Paper presented at the Summer Meeting of the Linguistic Society of America, 1974. Amherst, MA.
Chomsky, N. and M. Halle. 1968. The Sound Pattern of English. New York: Harper and Row.
DeChêne, B. 1975. ‘The treatment of analogy in a formal grammar’. In Papers from the Eleventh Regional Meeting of the Chicago Linguistic Society, ed. by R.E. Grossman, L.J. San and T.J. Vance. Chicago: CLS. 152–64.
Dressler, W.U. 1985. Morphonology: The Dynamics of Derivation. Ann Arbor, MI: Karoma.
Ford, A. 1980. ‘Une analyse morphologique du paradigme verbal en espagnol’. Paper read at the Annual Meeting of the Canadian Linguistics Association.
Ford, A. and R. Singh. 1983. ‘On the status of morphophonology’. In Papers from the Parasession on the Interplay of Phonology, Morphology and Syntax, ed. by J. Richardson, M. Marks and A. Chukerman. Chicago: CLS. 63–80.
Ford, A. and R. Singh. 1985. ‘On the directionality of word-formation rules’. In Eastern States Conference on Linguistics 84, ed. by Alvarez Geloria, Belinda Brodie and Terry McCoy, pp. 205–13. Columbus: Ohio State University.
Ford, A. and R. Singh. 1991. ‘Propédeutique morphologique’. Folia Linguistica. 25(3/4): 549–75.
Halle, M. 1973. ‘Prolegomena to a theory of word-formation’. Linguistic Inquiry. 4(1): 3–16.
Halle, M. 1989. ‘An approach to morphology’. In Proceedings of North East Linguistic Society 20, ed. by J. Carter et al. Amherst, MA: University of Massachusetts Press. 155–84.
Halifax, N.S. and Ford, A. 1982. ‘La place de la morphophonologie dans la grammaire’. Revue de l’Association Québécoise de Linguistique. 2(2): 73–80.
Harris, J.W. 1978. ‘Two theories of non-automatic morphophonological alternations: Evidence from Spanish’. Language. 54: 41–60.
Hetzron, R. 1972. ‘The shape of a rule and diachrony’. Bulletin of the School of Oriental and African Studies. 35: 41–60.
Hjelmslev, L. 1963. Prolegomena to a Theory of Language. Madison: University of Wisconsin Press.
Hooper, J.B. 1976. An Introduction to Natural Generative Phonology. New York: Academic Press.
Hooper, J.B. 1979. ‘Child morphology and morphophonemic change’. Linguistics. 17: 21–50.
Hudson, G. 1975. Suppletion in the Representation of Alternations. Ph.D. dissertation, UCLA.
Kiparsky, P. 1982. ‘Lexical morphology and phonology’. In Linguistics in the Morning Calm, ed. by I.S. Yang. Seoul: Hanshin. 3–91.
Kurylowicz, J. 1949. ‘La nature des procès dits «analogiques»’. Acta Linguistica. 5: 121–38. Reprinted in Readings in Linguistics II, ed. by E. Hamp, F. Householder and R. Austerlitz. Chicago: University of Chicago Press, 1966. 158–74.
Lieber, R. 1981. On the Organization of the Lexicon. Ph.D. dissertation, MIT. Bloomington: Indiana University Linguistics Club.
Malkiel, Y. 1976. ‘Multi-conditioned sound change and the impact of morphology on phonology’. Language. 52: 757–78.
Malkiel, Y. 1982. ‘Interplay of sounds and forms in the shaping of three Old Spanish medial consonant clusters’. Hispanic Review. 5(8): 247–66.
Marchand, H. 1969. The Categories and Types of Present Day English Word-formation. Munich: Beck.
Martinet, A. 1960. Éléments de Linguistique Générale. Paris: Armand Colin.
Martohardjono, G. and R. Singh. 1992. ‘Integral listing and speech errors’. Linguistische Berichte. 139: 153–68.
Matthews, P.H. 1974. Morphology: An Introduction to the Theory of Word-structure. Cambridge: Cambridge University Press.
Morin, Y.C., M.C. Langlois and M.E. Varin. 1990. ‘Tensing of Word Final [ɔ] to [o] in French: The phonologization of a morphophonological rule’. Romance Philology. 18(4): 507–28.
Nida, E.A. 1946. Morphology: A Descriptive Analysis of Words. Revised edition. Ann Arbor: University of Michigan Press.
Picard, M. 1977. ‘Les règles morphophonémiques en diachronie’. Recherches linguistiques à Montréal. 8: 129–36.
Selkirk, E. 1974. ‘French liaison and the X-bar convention’. Linguistic Inquiry. 5: 573–90.
Selkirk, E. 1982. The Syntax of Words. Cambridge, MA: MIT Press.
Siegel, D. 1979. Topics in English Morphology. New York: Garland.
Singh, R. 1981. ‘In defense of the Universal Syllabic Template’. Recherches Linguistiques à Montréal. 17: 119–33.
Singh, R. 1986. ‘On finding a place for Trubetzkoy’s brain-child: A review of W. Dressler’s Morphonology: The Dynamics of Derivation’. Canadian Journal of Linguistics/Revue canadienne de linguistique. 31(4): 343–63.
Singh, R. 1987. ‘Well-formedness conditions and phonological theory’. In Phonologica 1984, ed. by W.U. Dressler, H.C. Luschützky, O.E. Pfeiffer and J.R. Rennison. Cambridge: Cambridge University Press. 273–85.
Singh, R. ‘Natural Phono(morpho)logy: A view from the outside’, in Natural Phonology: The State of the Art, ed. by B. Hurch and R. Rhodes, pp. 1–37. Berlin: Mouton.
Singh, R. and A. Ford. 1980. ‘Flexion, dérivation et Pāṇini’, in Progress in Linguistic Historiography. Studies in the History of Linguistics 20, ed. by K. Koerner. Amsterdam: Benjamins. 323–32.
Sommerstein, A.H. 1977. Modern Phonology. London: Edwin Arnold.
Trubetzkoy, N.S. 1929. ‘Sur la Morphonologie’. Travaux du Cercle Linguistique de Prague. 1: 85–88.
Vennemann, T. 1974. ‘Words and syllables in natural generative grammar’. In Papers from the Parasession on Natural Phonology, ed. by A. Bruck, R.A. Fox and M. Lagaly. Chicago: CLS. 346–74.
Wittgenstein, L. 1953. Philosophical Investigations. Translated by G.E.M. Anscombe. New York: Macmillan.
Wurzel, W.U. 1970. Studien zur deutschen Lautstruktur, Studia Grammatica 8. Berlin: Akademie-Verlag.
Zager, D. 1979. ‘Changes in inflectional paradigms: In pursuit of the morphemic beast’. In The Elements: A Parasession on Linguistic Units and Levels, ed. by P.R. Clyne, W.F. Hanks and C.L. Hofbauer. Chicago: CLS. 285–95.
2
Some Advantages of Linguistics without Morpho(pho)nology¹
Alan Ford and Rajendra Singh

The reintegration of the operations said to be morpho(pho)nological into morphological operations accommodates, in a natural way, the competition which has always existed between word-formation strategies and morphonological operations, in particular where the latter aim at rendering all the formal differences between two words. For example, suppose that, instead of postulating, as Bybee (1980) does, that a morphophonemic operation intervenes to ensure the vocalic alternation observed in the first syllable of the two Spanish words cierro ‘I close’ and cerramos ‘we close’, we integrate this alternation into a global morphological operation of the form in (1):

(1) [Xo]Vl → [XeRámos]Vlp

which is directly in competition with operation (2):

(2) [Xo]Vl → [Xámos]Vlp

We are then able to explain naturally the existence of dialects or idiolects in which speakers produce either cerramos or cierramos as the first person plural, and either cierro or cerro as the first person singular, of the present indicative. This analysis correctly predicts the existence of three dialects which use, respectively, cérro and cerrámos, cierro and cierramos, and cierro and cerramos, and explains the non-occurrence of a dialect with cerro and cierramos.

Our morphology is an autonomous component of a generative grammar, clearly identifiable by the distinctive form and the order of its operations on the one hand, and by the directionality of its relations with the other components on the other.² The morphological component takes the form of a set of word-formation strategies whose output feeds the syntactic component and, through it, the phonology of the grammar. The corpuscles³ whose generation it ensures connect to information which, in the spirit of the late David Bohm (Bohm and Hiley 1975), takes a wave-like form, that of the scenes and metaphors of the ethnic group that uses the language. The word that morphology defines therefore becomes a quantum of information, a piece in the speaker’s game.

From an Aristotelian point of view, the morphological operations could have appeared like those in (3). They have since evolved into two different visions of language: that of Aristotle himself, to whom we could retrospectively attribute the schema in (3),

(3)
Morphological processes
    Different
        Simple
            Affixal
                Prefixal (unpleasant)
                Suffixal (banality)
            Non-affixal
                Segmental (house N / house V)
                Musical (prótest N / protést V)
        Mixed (affixal + non-affixal) (electric Adj / electricity N)
    Similar (chair N / chair V)
and that of a modernism rooted in phonetics, so cherished by structuralists, which reduces the schema in (3) to (4):

(4)
Morphological processes
    Affixal
        Prefixal
        Suffixal
This reduction of the range of morphological operations was made essentially at the expense of phonology, to which the whole set of non-affixal and mixed categories was assigned, although a further reduction was achieved by assimilating the category of the similar to that of the different, modulo the introduction of a differentiating element called ‘zero’. But one of the highest costs for our science has been the birth of morphonology, which constitutes a supplementary effort to classify, among other things, the leftovers of the partial assimilation into phonology of the supposedly phonological part of the mixed category. Our own research has led us further towards the reduction of the range of morphological operations, but in our case this reduction was effected without complicating phonology. According to our 1991 aphorism, the morphological operations are limited to one, which we formulate as (5a):

(5) a. The morphological theory presented here rests on the following hypothesis: the morphological relation between two morphologically linked words of the same language is expressed by a generative rule having the form in (b) below (which we henceforth call a ‘morphological strategy’):
    b. [X]A ↔ [X′]B
       where
       (i) [X]A and [X′]B are words,
       (ii) A and B are categories,
       (iii) ‘↔’ indicates an equivalence relationship, i.e., a bidirectional implication,
       (iv) X and X′ are semantically related,
       (v) the prime [′] symbolizes a formal difference between the two elements of the relationship, henceforth called the morphological ‘operation’ or ‘constant’,
       (vi) the prime [′] can be null if A ≠ B (Ford and Singh 1991).
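The schema in (5b) can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not the authors' formalism: the class name `Strategy`, the restriction of the constant to a suffix, the orthographic forms and the category labels `V.1sg`, `V.1pl`, `N` and `V` are all hypothetical. It shows two properties stated in (5): the relation can be read in either direction (clause iii), and a null formal difference is admitted only when the two categories differ (clause vi).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Word = Tuple[str, str]  # (form, category); a hypothetical encoding of a word


@dataclass(frozen=True)
class Strategy:
    """[X + left_suffix]A <-> [X + right_suffix]B, with X the shared variable."""
    left_suffix: str
    left_cat: str
    right_suffix: str
    right_cat: str

    def __post_init__(self) -> None:
        # Clause (vi): the formal difference may be null only if A != B.
        if self.left_suffix == self.right_suffix and self.left_cat == self.right_cat:
            raise ValueError("a null formal difference requires A != B")

    @staticmethod
    def _map(word: Word, src_suffix: str, src_cat: str,
             dst_suffix: str, dst_cat: str) -> Optional[Word]:
        form, cat = word
        if cat != src_cat or not form.endswith(src_suffix):
            return None  # the word is outside the range of the strategy
        stem = form[:len(form) - len(src_suffix)]  # the variable X
        return (stem + dst_suffix, dst_cat)

    def forward(self, word: Word) -> Optional[Word]:
        return self._map(word, self.left_suffix, self.left_cat,
                         self.right_suffix, self.right_cat)

    def backward(self, word: Word) -> Optional[Word]:
        return self._map(word, self.right_suffix, self.right_cat,
                         self.left_suffix, self.left_cat)


# A strategy modelled loosely on (2), in plain orthography (accent omitted).
first_plural = Strategy("o", "V.1sg", "amos", "V.1pl")
# A zero-difference strategy of the chair N / chair V type.
conversion = Strategy("", "N", "", "V")
```

Calling `first_plural.forward(("cerro", "V.1sg"))` yields `("cerramos", "V.1pl")`, and `first_plural.backward` maps the plural form back to the singular, so a single statement covers both directions. A real strategy would also need prefixal and non-affixal constants, which this sketch leaves out.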
The objections to this vision of language and, in particular, to the role that morphology plays in it, seem numerous in current generative linguistics, where the identification of morphological operations with those of syntax on the one hand, or with those of phonology or of a so-called morphonology on the other, is claimed viva voce. That is why, in this article, we respond to some of the objections to the existence, partial or total, of morphology, or at least of morphology as we perceive it.

Sadock (1980 and 1991, among others) calls for a place in grammar for a prelexical syntax, a place to house an operation which he identifies under the generic label ‘noun incorporation’ and which would be indispensable for capturing the structure and behaviour of certain verbs of Inuttitut, Southern Tiwa and probably, in the spirit of his approach, of analogous phenomena in many other languages. Even if this class of verbs results, as he admits, from the addition of suffixes used to ‘derive’ them from substantives, Sadock does not want to call these operations ‘morphological’. The reason is that the resulting verbs, some examples of which are listed under (6) together with the words belonging to these structures⁴ from which, according to Sadock, they derive, have many properties that can be explained ‘naturally’ simply by using a syntactic operation of noun incorporation.

(6)        Noun                     Verb
    (18)⁵  qimmiq ‘dog’ ABS.        qimmiqarpuq ‘he has a dog’
    (19)   sapangaq ‘pearl’         sapangarsivuq ‘she buys pearls’
    (20)   nirriq ‘table’           nirriviliurpuq ‘he has dressed the table’
These properties, which identify words like qimmiqarpuq, sapangarsivuq and nirriviliurpuq as syntactic structures, are, again according to Sadock, the following: A. Modification; B. Possession; C. Reference.

A. Modification: a modifier of a verb of this class takes a nominal form in the same case and number as the verb complement would take in a synonymous sentence with an explicit and distinct verb and complement. For example, (26) would be such a synonymous counterpart of (27):

(26) sapanngamik   kusanatumik   pisivuq
     pearl-INST.   nice-INST.    he/she bought something-INST.IND.3sg.
     ‘He/she bought him/herself a nice pearl’
(27) kusanatumik   sapangarsivuq
     nice-INST.    he/she bought a pearl-INST.IND.3sg.
     ‘He/she bought him/herself a nice pearl’

where sapanngamik, complement of the verb pisivuq, is singular and in the instrumental case. Kusanatumik, modifier of sapanngamik, agrees with the latter in number and case. In (27) the verb complement kusanatumik is also singular and instrumental. According to Sadock, it can only be so for the same reason as in (26), that is, because it agrees with the verb’s object complement, which has been incorporated into the verb; i.e., even if, during the incorporation operation, sapanngamik has been transformed into sapangaq, which would be, according to Sadock, the suffixal form of sapanngamik, it still preserves some of its properties, in particular its number and case. To describe the formal difference between the two words, Sadock opts for an explanation based on the level at which the operation applies: if the formal marks of number and case are not present, it is because the word which must bear these marks is realized only once the appropriate moment in the generation arrives. Sadock’s grammar must therefore use two levels of syntactic operations, one which includes incorporation and is pre-lexical (so, in his use of the term, ‘pre-morphological’), and the other post-lexical and capable of inheriting the marks of case and number, which are not, in Sadock’s mind, generated morphologically, but rather syntactically. Curiously, neither the duplication of syntax nor the fragmentation of morphology into pre- and post-lexical morphologies seems to bother him at all, in spite of his concern for ‘economy’. To sum up, his generation of kusanatumik sapangarsivuq ‘He/she bought him/herself a nice pearl’ is the following:

(7) Derivation of kusanatumik sapangarsivuq ‘He/she bought him/herself a nice pearl’
[Tree diagrams for (7) not reproduced. Panel labels: D Structure; Pre-lexical syntax: nominal incorporation; Post-lexical syntax: grammatical agreement; Lexicalization (morphology 1); Lexicalization (morphology 2). Terminal material shown in the trees: sapangaq ‘pearl’, sivuq ‘buy’, kusanatuk ‘nice’ [+instr., +sing.], Pro, a trace t, and the incorporated form sapangaqsivuq.]
Sadock concludes:

If (27) were derived either from a structure very much like (26) or from a somewhat more abstract structure into which the empty stem had not yet been inserted, then the case for the modifier would automatically be assigned by independently-needed rules. However, the generalisation that is so obvious between (26) and (27) would be obscured if the object incorporating verbs had to appear fully formed in deep structure. (Sadock 1980: 307)

In his mind, redundancies would in that case be created at the grammatical level: on the one hand an ad hoc syntactic rule to ensure case assignment, on the other a semantic rule, equally ad hoc, to interpret the explicit complement as a modifier of the incorporated complement. Under the solution we propose, neither of these redundancies is necessary either, but we dispense with the two syntactic components and content ourselves with a single, unified morphological one. In Inuttitut the morphological link between the two words sapangaq and sapangarsivuq is expressed by the strategy in (8):

(8) [X]abs.s ↔ [Xsivuq]intl.3s.

The difference in form between sapangaq+sivuq, which is the morphological output, and sapangarsivuq, its phonetic form, results from a phonological well-formedness condition of Inuttitut, *qs. The violation is repaired phonetically by realizing /q/ as [r]. On the syntactic level, pisivuq and sapangarsivuq are two verbs which take an optional complement in the instrumental. On the grammar of these forms, it seems to us that there is nothing to add, except that the case morphology identifies sapangamiq as the instrumental singular corresponding to the absolutive form sapangaq of the word which designates a decorative pearl. This relation is given, among others, by the strategy in (9):

(9) [Xq]abs.s ↔ [Xmiq]1inst.3s.

B. Possession: a verb of this class can take an explicit nominal complement in the relative case, which is interpreted as the ‘possessor’ of the underlying verb complement, as in example (33), where tuttup, in the relative case (the possessor’s case in Inuttitut), is the possessor of an underlying niqaanik ‘meat’:

(32) tuttup         niqaanik         nirivunga
     caribou-REL.   meat-INST.3sg.   eat-IND.1sg.
(33) tuttup         niqiturpunga
     caribou-REL.   eat meat-IND.1sg.
According to Sadock, the verb niqiturpunga of sentence (33) is of the same kind as sapangarsivuq of (27). That is to say, its stem is the result of an operation incorporating the noun niqaanik, which is explicit in (32), where it appears in the singular of the instrumental case as the possession of tuttup, its possessor in the relative case. In (33) tuttup also appears in the relative case, as the possessor of the incorporated form of niqaanik, which would be the niqi- of the verb niqiturpunga. According to Sadock, the position of tuttup, which obligatorily precedes the verb in (33), is also significant, because ‘possessor nouns are in the relative case and stand before possessed nouns’ while, as he notes on many occasions, the order of syntactic constituents in Greenlandic is usually free. He deduces that ‘[t]he fact that incorporated nouns can be possessed is exceedingly important evidence for a syntactic incorporating rule, since sentences like (33) could not otherwise exist.’ He concludes:

Ex. (32) shows an independent verb and an object NP. The object consists of a possessor in the relative case, followed by the possessed – whose inflection indicates the person and number of the possessor, as well as the case of the entire NP. In (33), however, we find a denominal verb; but there is still a possessor, and the incorporated noun is understood as possessed. Note in particular that the case of the possessor is relative, just as it would be in an overt possessed-possessor construction. Obviously, if (33) is derived from a structure very much like (32), the case of the possessor, as well as the semantics of the sentence, is accounted for directly. (1980: 309)

According to our analysis, which returns to morphology what Sadock claims for syntax, a verb like niqiturpuq ‘to eat some of its meat’ is linked morphologically to the noun niqi ‘meat’ by the strategy in (10):

(10) [Xq]abs.s ↔ [Xturpuq]ind.+rel.3s

where the addition of the property of taking an optional complement in the relative case is an integral part of the word-formation operation.
Of course, our analysis offers no answer, as Sadock’s claims to do, to the question of why this property is added, but we represent the phenomenon as it occurs, together with what appears to be the essential part of the linguistic context in which it occurs. The answer to this question, from our point of view, lies elsewhere, more precisely at the level of the speaker’s lived experience and inferences: if tuttup niqiturpunga, then there is in my universe a link between my eating and the caribou. It would be surprising if this were not the link most often described, at least by Sadock, as one of ‘possession’, in somewhat the same way that from the expression I am eating we deduce that I am eating food. It seems to us that the Inuttitut speaker, on hearing tuttup niqiturpunga, deduces that the meat I am eating is caribou meat, rather than supposing that the animal stands in some other relation to the event. He must at least ask himself what the caribou is doing in this context if that is not its role. If his house is upstream from mine, there is surely a river of some sort in the area, even if that is not explicit in the sentence. It is in this way that, to answer the troubling question Sadock seems to be asking himself, we use a natural principle of discourse economy, somewhat in the manner of Grice: what is assumed is never explicitly stated (or, if it is stated, it signifies something completely different).

C. Reference: Inspired by the notion of anaphoric island, a property which Postal (1969) attributes to the word, Sadock (1980) affirms that ‘We do not expect pieces of words to have independent referential or discourse properties.’ However, he notes that ‘Incorporated nominals in Greenlandic have the same kind of semantic/pragmatic status as independent nominals would be expected to have.’ In Sadock (1991), his conviction seems weakened and he claims that, ‘while it is frequently the case that noun incorporation is accompanied by lack of semantic/pragmatic autonomy of the incorporated nominal (particularly where there is no indication of the syntactic independence of the nominal), it is not always the case’ (ibid. p. 86). In order to illustrate this, we are given example (23):

(23) kisiannimi usi nassataqarpunga katersuriarlugit ingerlaannarlunga
     but / indeed / I have luggage / collect them-3p / I just have to go-IND.1s.
     ‘But indeed, I do have luggage; I only have to go and collect it.’

where the ‘noun’ said to be incorporated into the verb is co-referential with the 3p complement that is explicit in the morphology of the verb katersuriarlugit. To ensure the co-reference between this morphological mark and the incorporated complement of the verb nassataqarpunga, Sadock tells us, the complement of this verb should take the form of an explicit N″ in D structure.
To sum up the property which Sadock identifies under the generic label of modification: he believes that an intransitive verb like sapangarsivuq takes an implicit complement in the instrumental case because there exists a transitive version of this verb which takes a complement in the instrumental case. However, other verbs exist in Inuttitut which take a complement in the instrumental case and which have no corresponding version with an explicit complement in the instrumental; examples are given in (11):

(11) tingummik alallijunnijaruk
     liver-INST. / please bring it to him
     ‘Bring him some liver, please’
     aalisakkani ilinnut nuwaisitsiiqquwai
     fish-INST.
     ‘He asks you to put some fish in his belly’
     inungnguwanik piingasungnguwanik takungngilatit
     little people-INST.
     ‘Have you not seen three little ones?’

Why should this property of taking the instrumental, symbolized by the ‘1’ to the right of the arrow in our strategy (9), not result from the morphological operation which creates the verb? As for what Sadock calls ‘reference’, the problem with his analysis is that nothing in the facts cited imposes the incorporation analysis he gives. Sapangarsivuq is a morphologically intransitive verb which, syntactically, takes a complement in the instrumental case. The representation of such facts is justified by the fact that many verbs of this class already exist in Inuttitut, as we have indicated in (9). These forms result at least in part, according to Bergsland (1956), from a morphological operation which links them to transitive verbs without the help of any incorporation operation.
The latter says the following about it:

An instrumental form may have the force of a more or less indefinite object or a remoter object, in combination with certain intransitive or intransitively used verbs, especially verbs with an intransitivizing (medializing) derivational suffix, e.g., niqinik mayuussillutik ‘to bring up the meat’ inst. (ibid. p. 76)

Most transitive verbs, including verbs transitivized by the suffixes mentioned in 63–64 and 66 may have their object turned into an instrumental term [cf. example above; F & S] – and, consequently, their dependent subject turned into an annexed one – through a medializing suffix. (ibid.: 108–9)⁶

We deduce from this that, among the operations which accompany a category change in a morphological operation, in our sense, there may be an assignment or change of syntactic properties, such as the case taken by the verb, without any syntactic operation being involved. In somewhat the same way, a suffixation operation that creates a noun from a verb, such as (12), automatically assigns masculine gender to the French noun:

(12) [X]V3sg. ↔ [Xmã]Nm.sg.

as in words like abonnement, bouleversement, changement, débrouillement, etc. Likewise, operation (13) creates in Inuttitut a morphologically intransitive verb which takes the instrumental from a transitive verb which takes the absolutive:

(13) [Xpaa]tra.3s ↔ [Xpuq]int.3s

Even if, during the incorporation operation, sapanngamiq were transformed into sapangaq, which would be, according to Sadock, the suffixal form of sapanngamiq, it would still keep some of its properties, in particular its number and case. An operation is not, ipso facto, syntactic in nature merely because it manipulates syntactic category features; morphological operations habitually do the same.

Other generativists have also questioned the autonomy of morphology on the basis of syntactic analyses of phenomena analogous to nominal incorporation in other languages, e.g., Baker (1988) for Mohawk and Lieber (1992) for a variety of languages. Our arguments contra Sadock can easily be extended to these other cases. We shall, therefore, turn our attention to the other assault on morphology, which comes from another angle.
In the same way that syntax has allowed itself to take over part of the legitimate domain of morphology, phonology too has set itself the task of appropriating naturally morphological terrain. An important aspect of our theory of the relationship between morphology and phonology is that it excludes any intervention of morphological information in phonology proper, which means, concretely, any mention of morphological structure in the formulation of phonotactic well-formedness conditions (WFCs). This hypothesis is put directly to the test by the notion of morpheme structure condition (MSC), strongly defended by many linguists, in particular Mohanan (1996: 45), who concludes that

the construction types I have demonstrated to be relevant for phonology include the following types of information: morpheme (formatives and features), word, stem, affix, type of affixation, head, modifier and complement.⁷

With the exception of the concept of ‘word’, which is as fundamentally an element of phonological structure as it is of morphological structure, in the sense that every phonological WFC specifies its scope, either syllable or word, in its formalism, the morphological elements which Mohanan mentions have no place in phonology. To convince ourselves further, let us look more closely at the arguments put forward by Mohanan in favour of the position he defends. These arguments turn out to reduce to a set of illustrative descriptions of phonological structures, each said to require an MSC. The first phenomenon concerns the existence in English of words of the form in (14):

(14) act, lift and risk

and the absence of others of the form in (15):

(15) *zbin, *rizg and *lizb

This is coupled with the fact that certain words like twelve, which have a voiced consonant in absolute final position, devoice this consonant when a morphological constant beginning with a voiceless consonant is added, as in twelfth, while others which end in a voiceless consonant, like cat, devoice the initial voiced consonant of an added suffix, as in cats [kats], the pronunciation of the morphological structure of the plural, /katz/, that is, cat plus the suffix /z/. Mohanan’s solution to this set of facts is to postulate the existence of the MSC (16):

(16) ‘Adjacent obstruents within a syllable are voiceless. (Absolute morpheme internally, weak across morphemes).’

The problem with this solution is, first, that it excludes from the language a great number of words which occur in many speakers’ speech. We are thinking, for example, of family names like Briggs, Dodds, Higgs, Hobbs, Mabbs, Tibbs and Woods, of well-known locality names such as Hyves and Agde, very popular places in the UK, and of expressions frequently used in certain dialects, like a diggs to designate an apartment, typical of London dialects and probably others, and hibs nibs, an expression frequent in Dickens and still used in many dialects to designate the devil and, by extension, for some people, a person who is being talked about but whose name is deliberately avoided, generally so that this person or another does not become aware of the subject of the conversation. Moreover, (16) does not allow for the creation of neologisms such as the recently coined AIDS. This example of an MSC from Mohanan gives us the ideal opportunity to illustrate how our approach solves the problem without any of the inconveniences caused by his solution, and especially without the use of any MSC. The phonological WFC of English which is pertinent here is (17), and its violation is repaired by a progressive assimilation, as illustrated in a case like cats.

(17)
*Coda
    [–voice] [+voice]
A word like twelfth, on the other hand, results from the application of a morphological strategy which, while suffixing the morpheme /-θ/, devoices the preceding consonant. Notice that this transformation of the consonant is not of phonological origin, as might have been imagined, because it does not depend on any WFC, the sequence [vθ] being well attested in words like givth, livth, the archaic or dialectal third person singular present of the verbs give and live. As our own examples illustrate, there is no restriction on syllable codas of the form [bz], [dz], [gz], and the morphology of Mohanan’s examples takes the form of the two strategies (18) and (19):⁸

(18) [X]N.sg. ↔ [Xz]N.pl.   cats, dogs, etc.
(19) a. [X/+voice/]N.sg. ↔ [X/–voice/θ]Adj.   twelfth
     b. [XC]N.sg. ↔ [XCθ]Adj.   fourth, fifth, sixth, etc.

In what follows, we present the arguments against the other MSCs of Mohanan.
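The division of labour between strategy (18) and the WFC in (17) can be sketched in a few lines. This is a minimal illustration, assuming a toy one-symbol-per-segment transcription and treating every input as a single syllable; the function names and the segment inventories are hypothetical and not part of the authors' proposal. The morphology always adds /z/, and a separate phonological repair devoices a voiced obstruent that directly follows a voiceless one.

```python
# Toy segment inventories (an assumption; one character per segment).
VOICELESS = set("ptkfsθ")
DEVOICE = {"z": "s", "d": "t", "b": "p", "g": "k", "v": "f"}


def repair_coda_voicing(form: str) -> str:
    """Progressive assimilation: devoice an obstruent after a voiceless segment."""
    out = []
    for seg in form:
        if out and out[-1] in VOICELESS and seg in DEVOICE:
            seg = DEVOICE[seg]  # repair of the *[-voice][+voice] sequence
        out.append(seg)
    return "".join(out)


def plural(singular: str) -> str:
    """Strategy (18) read left to right: the morphological output is X + /z/."""
    morphological_output = singular + "z"
    # Phonology sees only the output form, never its morphological structure.
    return repair_coda_voicing(morphological_output)


print(plural("kat"))   # kats
print(plural("dog"))   # dogz
print(plural("brig"))  # brigz
```

Nothing in the repair function refers to morphemes: codas such as [gz] pass through untouched, which is the point made with Briggs and Higgs, and only the voiceless-voiced sequence is altered.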
His second example takes the form of (20a):

(20) a. Morpheme internally in English, a coda can have at most three consonants.

The phenomenon which this condition aims at explaining is the fact that words resulting from suffixation can have syllable codas of more than three consonants, as in a word like texts, which our strategy (18) could generate. Our research has not led us to a counter-example to the condition in (20a). This does not mean that the condition holds, because it is not our intuition that a native English speaker could not pronounce words of the forbidden form. There simply are none, at least to the best of our knowledge. Imagine, for example, that a company is incorporated under the name American Continental Shipping and Transport Services. Would the speaker be able to talk about an ACSTS truck or to say ACSTS shares rising at the stock market? The speakers with whom we have checked this example, as well as others of the same type, such as Universal Turbine Throttle Cleaning Service (a UTThCS van), do not have the impression that these words defy pronunciation altogether, although they are not easy to pronounce at first.

A third example of an MSC from Mohanan, which also concerns English, is (20b):

(20) b. In English, a consonantal segment can be syllabic only at the end of a morpheme/stem.

For this condition to be respected, a word like Middlesborough or Edinborough would have to be structured as in (21), where ‘//’ (the double slash) represents a morpheme boundary and ‘/’ (the single slash) a syllable boundary:

(21) Middlesborough /mɪd/lz/brə/
     Edinborough /e/dɪn/brə/

For this to be the case, a word-formation strategy of the form in (22) would have to exist:

(22) [Xq]? ↔ [Xbrə]N.pl.

The problem is that, according to our theory, for (22) to be a well-formed strategy, /mɪdlz/ and /edɪn/, the elements to which the suffix /brə/ is added, would have to be words, which implies that they are identifiable as such and belong to one and the same morphological category. Since this is not the case, the strategy in (22) becomes suspect and probably does not belong to the morphology of English, since the category marked in (22) by a question mark remains undeterminable. So from our viewpoint, these words constitute counter-examples to the MSC (20b), although Mohanan, not sharing our view of morphology, will probably not find this objection valid.

The fourth example of an MSC from Mohanan also concerns English:

(20) c. Morpheme internally, in English, dental fricatives cannot occur after an obstruent in a coda.

It seems to us that a word like plimpth constitutes a counter-example to this condition, and there probably are others.

Other reproaches concern technical points: the form of the strategy itself, the information it puts to use and its organization within the morphological component. Some blame us for the absence of an explicit place for what has long been called ‘productivity’. On this point, we take the opportunity to identify all forms of ‘productivity’ as a statistical construct which reflects nothing about morphological operations and their workings, just as it reflects nothing about the functioning of phonology. All our strategies are equally productive in the sense that they apply where they can. The nature of their competition is partially determined by the form of each strategy, which determines its range, but in the case of real, direct competition between two strategies with overlapping ranges, the determining factors must be extralinguistic: there is nothing in the grammar which forces us to say he strove rather than he strived, deux chevaux rather than deux chevals, chupa-ajis rather than chupa-ajises, but there can be many extralinguistic factors which contribute to the choice.
This competition between morphological strategies not only distinguishes them from the phonological repair strategies, which, in contrast, are always absolute, but also deals morphonology its fatal blow. If the formal alternation between goose and geese is explained by an operation said to be morphonological and the one between duck and ducks by a morphological strategy, how can we account for the forms gooses and geeses which characterize the speech of most English speakers at a certain
58 ! Alan Ford and Rajendra Singh
time in their learning? However, if both alternations are purely morphological, the variation becomes natural.

An aspect of our morphology which seems not to have been well understood is the fact that it does not appear to account for so-called compound words, that is, words which, in other analyses, come from the association of two words. From this angle, it is true that our morphology does not generate any word of this nature, and this simply because there aren’t any, in any language. Words of the type ouvre-boîte, nœud papillon or mainlevée, which have been so described in French, are not formed of two other words, but rather, in conformity with our morphological strategy, of a constant which, in these examples, takes a form which looks like that of a real word which, as such, could well be found in another context (this corresponds in our three examples to the forms ouvre-, -papillon and main-), and a variable which, as a member of a morphological category, is necessarily and by definition a word. Hence, among the morphological strategies of French, the three word-formation strategies (23), (24) and (25) can be found, which generate respectively and among others the words ouvre-boîte, nœud papillon and mainlevée:

(23) [X]N ↔ [uvrX]N m.
(24) [X]N ↔ [Xpapijɔ̃]N
(25) [X]Adj. fem. ↔ [mɛ̃X]N fem.

Another objection to this unified account of the morphological operation concerns the bi-directionality of the strategies. We have justified elsewhere9 the existence of this property, but these arguments can be reinforced by underlining an aspect of the question on which we have not insisted in our article, that is, the fact that the bi-directionality of the strategies allows us to dispense completely with the notion of ‘zero’ as a morphological mark, and at the same time to characterize the similarity relationship which always constitutes a competing strategy, just like any other.
Thus, we reestablish the Aristotelian order overthrown by structuralism in the way we have mentioned in the beginning of this article. Let us mention, besides this advantage given by our morphology, other advantages which also help solve certain problems which have long haunted many linguists. The first concerns the question of accent placement and its place in grammar. It is clear that, for languages which know a variable accent position, this variation depends, in our view, either on phonological factors, or on morphological
factors. In a description from a theoretical framework which possesses a morphonology (the classical examples are those of Latin and English), the accent is either morphonological or phonological. It seems that the supposedly phonetic basis of these pretended sub-domains of grammar has continually prompted the search for a unified theory of accent placement. By reunifying morphonology with morphology, we can hope to kill this temptation. It is clear that in this type of language, accent placement often figures as an integral part of a morphological operation in the sense in which we understand the term, and sometimes constitutes by itself the formal difference between two words: pego ‘I hit’ and pegó ‘he hit’ in Spanish or protestV and protestN in English are classical examples. By separating these clear examples of morphological accent placement from cases of phonological accent placement, for example the WFC limiting the number of accents in a Spanish word to one, or the WFC which, in the same language, forbids all accents farther than three syllables from the end of the word, we hope to encourage certain people to abandon the search for a unified theory of accent placement in these languages and to concentrate their energies on more realistic, that is, realizable, objectives.

Finally, notice how the fusion, in a single operation, of a word-formation strategy with the operation said to be morphonological which, in an analysis which maintains morphonology, it is supposed to trigger, allows us to explain why we never observe the application of one without the other. This constitutes, in our view, a weighty argument against the identification of an autonomous morphonological component. An important factor in favour of this hypothesis comes from diachronic linguistics, which seems to offer no support to the possibility that the ‘phonological’ part of an operation said to be morphonological could extract itself from the context and generalize.
As for the putative counter-examples to the fact that morphonology does not generalize diachronically, given the fact that Malkiel (1982: 248) has withdrawn his Spanish example of 1976 (cf. Mendez Dosuna and Pensado 1986 also), there are not many left, and we should perhaps limit ourselves to the recent and almost convincing case presented by Morin, Langlois and Varin (1990), which we have treated in Ford and Singh (1991). These authors claim that ɔ → o, the undisputed ‘morphonological’ alternation, in their view, of 17th century French, has generalized to become what we all take
to be a global process, purely phonological, for example the fact that the alternation [sɔ] ‘stupid’ sg. [so] ‘stupid’ pl. has neutralized to [so] in both numbers. Concerning this phenomenon, four points are clear: (i) it was ‘morphophonological’ in the sense of Morin, Langlois and Varin, in the 17th century and thereafter; (ii) it is now phonological in the sense in which we all understand it; (iii) it has never generalized, in the sense that it never came to signal a new morphological opposition, nor has it ever been incorporated into a word-formation operation with which it has been associated; and (iv) it has undergone a lexical spreading, first within its original category and, later, across categories as a ‘lexical redundancy rule’. The impression it gives of having been transformed into a ‘phonological rule’ is in great part a result of the fact that the other major category, the verb, did not have any form in final ɔ (Morin, Langlois and Varin 1990: 516). The generalization of a ‘morphophonological rule’ as a ‘phonological rule’ must, it seems to us, involve the appropriation of other morphological categories, which is not the case when ɔ is tensed, because this tensing passes from the plural to a more and more trans-category ‘redundancy rule’, to end as a WFC, without ever being triggered by another morphological operation. Although Morin, Langlois and Varin are still right in the sense that this operation (or, more precisely, the neutralization which it accounts for) is phonologized, their example seems to be a case of phonologization without the type of generalization that we take to be criterial for the phenomenon we are looking at. It seems to be a case of global decomplexification rather than the generalization of a ‘morphophonological rule’ as a ‘phonological rule’. The phenomenon has not penetrated other morphological operations, but has still reached the status of an automatic alternation, at least for an imaginative speaker.
Even if it is one of the most seriously documented studies of what the authors take to be a progressive dismantling of the morphological and lexical conditions of a ‘morphophonological rule’, there is no need, fortunately, to interpret it as a clear case of a ‘morphophonological rule’ which violates our 1983 restrictions.10

The case presented by Morin et al. aims at questioning Hooper (1976: 91), who excludes the dismantling of conditions on a ‘morphophonological rule’ as a possible evolution. The necessary conditions for the dismantling to lead to a ‘phonologization’ of an
alternation which Hooper qualifies as an ‘MP rule’ call for close consideration, because even the type of example raised by Morin et al. is rare. Maybe the true importance of the example under consideration is to be found in the fact that we make the mistake of raising in the same context synchrony, the communicative capacity for extraction and the possibility of diachronic social generalization. The speaker does not extract anything; we are rather in the presence of subtle factors which produce a situation which cannot be empirically distinguished from that which Hooper is trying to exclude. The rarity of such a development is here but a consequence of the fact that the happy combination required does not occur too often. The generalization of a ‘morphophonological rule’ in the forbidden sense mentioned still remains unattested.

Maybe one should highlight the crucial point, which is that the so-called ‘phonological’ part of a complex rule which incorporates both an affix and a non-automatic alternation cannot be extracted or abstracted away in such a way that it becomes available to associate itself with another affix with which it previously had no association. That is what we wanted to signal when we indicated that -ito or -ita could never trigger a monophthongization in Spanish. It might be possible, where no such extraction is required, that this part of morphonology generalizes with the help of the frequent rule of competition between morphological strategies. The crucial point is that the so-called ‘phonology’ of a morphologically complex operation is an integral part of it and could never isolate or separate itself from it. Diachronic linguistics confirms our hypothesis of non-separability. That was the sense of our remark saying that ‘morphophonology’ was not mobile, and we do not see why we should move away from this position.
In an analogous way, our morphology allows us to account for the behaviour of speakers in a language contact situation or during the acquisition of a second language. We know that in contexts of acquisition of a second language and of languages in contact in general, what we have called ‘morphonology’, and morphology as we understand it, behave in the same way: neither of them seems to create what we often call ‘interference’. In the context of the acquisition of a second language, the ‘mistakes’ that the learner is supposed to produce come from the phonology and the morphology of his first language (cf. Singh 1991, Singh and Ford 1987, Singh and Martohardjono 1988, and Singh and Parkinson, in press). In the
context of what we often call languages in contact, phonological and morphological adaptation are reinforced under the influence of the phonology and morphology of the borrowing language. In both contexts, morphonology behaves like the bound morphemes of Moravcsik (1978: 110); put differently, they do not move without being accompanied by the words to which they belong. It is precisely because of this way of moving that the morphology of a set of words borrowed into any number of languages behaves morphonologically in a way which looks a lot like its behaviour in its original language. In our view, a set of borrowings, phonologically and semantically related, gives rise to a word-formation strategy which should not vary from one source language to another. The limit of their variation is that determined by their phonology. It does not matter much whether the phonological difference is at the level of features, segments, prosody or affixes. Once the strategy is available, it can serve to produce words which for certain speakers must always display their morphological complexity in a different way. Ill-formity in English is at least as possible as priesthood, and mongeese as possible as mongooses. Whatever the status of Moravcsik’s universal principle #2, it does not seem to impose distinctions between morphonology and morphology as far as second language acquisition or languages in contact are concerned. As a bonus, it follows that the dissociation of morphonology from phonology, while still allowing us to account for empirical facts, gives us a much more restrained phonology than that which results from keeping the link between both.
To conclude, we consider that the objections raised against a morphological component designed to feed phonology directly, without any intermediary, are in no case valid, and that it is desirable to continue using this model for linguistic description, without fear of any important loss of information.
Notes

1. This is a translation, prepared by Luc Baronian (Stanford University), of the French original which appeared in Singh ed. 1996. We are grateful to the translator for his labour.

2. Therefore, we answer concretely a theoretical objection of Sadock, for whom such a grammar would not have its raison d’être (cf. also Mohanan 1995: 44–45). We believe that the question of ‘which component feeds which component’ (Sadock 1991: 10) is not as peripheral as Sadock claims.

3. These corpuscles are of course words here. If we dispense with this term, it is because it has many uses in linguistics which are not all mentioned here. In particular, we would like to avoid making reference to a fact which we find in many languages and which constitutes a syntactic operation of word-formation, often taking the form of the addition of an enclitic particle to a word, and which it is important not to confuse with a morphological operation which is clearly pre-syntactic, as the internal character of affixes versus enclitics demonstrates.

4. For Sadock, the stem which, depending on the grammar, underlies a word like qimmiqarpuq comes from a tree structure of the form: V″[V′[-arpuq] N′[qimmiq]] where [-arpuq] would be an ‘abstract’ verb.

5. For all the examples of Greenlandic cited here, we keep the numbering of the examples from the articles of Sadock from which they are taken. In the text of these examples, we replace the spelling used by Sadock by a phonological transcription which we need to formulate our morphological strategies. Therefore, the vowels e and o of Sadock’s examples are respectively rendered as i and u in the examples cited.

6. The numbers in the citation from Bergsland refer to examples which we reproduce here under (9).

7. Actually, he goes farther when he claims that ‘[t]he evidence I have reviewed above suggests that phonological principles need to refer directly to morphosyntactic constructs such as the morpheme, head, complement and modifier...’ (pp. 45–46), but since we defend the position that there is no more a morpho-syntax than there is a morphonology, we will not take here the space necessary to explore all the implications of these remarks.

8. Although the word twelfth seems to illustrate the phenomenon of hapax legomena to which we referred above.

9. Ford and Singh (1985).

10. The restrictions in question were formulated as follows in Ford and Singh (1983: 67): ‘Fourthly, morphophonological alternations do not act independently in the historical evolution of a language. We do not wish to affirm by this that one cannot witness the extension of a morphophonological process from generation to generation as the opposite is easy to attest. Neither do we wish to affirm that a morphophonological alternation could not be produced spontaneously during the evolution of a language. What we wish to affirm is that a morphophonological alternation is not dynamic in the sense that it can wander from one morphophonological context to another. For example, we’d like to claim that the mark of the imperfect or the morpheme -los could not trigger umlaut in German and that diminutive -ito/-ita could not trigger a monophthongization in the Spanish noun. We think, until proof to the contrary is offered, that historical changes of this form are impossible and that a morphophonological process cannot be mobile in this sense. Fifthly, a morphophonological alternation can spread without there being present the phonological conditions that were present at its appearance.’
References

Baker, Mark C. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago: University of Chicago Press.
Bergsland, K. 1956. A Grammatical Outline of the Eskimo Language of West Greenland. Oslo: Skrivemaskinatua Stortingst.
Bohm, D. and B. Hiley. 1975. ‘On the intuitive understanding of nonlocality as implied by Quantum Theory’. Foundations of Physics. 5: 93–109.
Bybee, Joan L. 1980. ‘Explanation in morphophonemics: Changes in Provençal and Spanish preterite forms’. Lingua. 52: 271–312.
Ford, A. and R. Singh. 1983. ‘On the status of morphophonology’. In The Interplay of Phonology, Morphology and Syntax, ed. by J. Richardson et al. Chicago: Chicago Linguistic Society. 63–78.
Ford, A. and R. Singh. 1985. ‘Towards a non-paradigmatic morphology’. In Proceedings of the Eleventh Berkeley Linguistic Society, ed. by Mary Niepokuj et al. Berkeley Linguistic Society. 87–95.
Ford, A. and R. Singh. 1991. ‘Propédeutique morphologique’. Folia Linguistica. 35: 549–75.
Hooper, Joan B. 1976. An Introduction to Natural Generative Phonology. New York: Academic Press.
Lieber, Rochelle. 1992. Deconstructing Morphology. Chicago: University of Chicago Press.
Malkiel, Yakov. 1982. ‘Interplay of sounds and forms in the Spanish of three Old Spanish medial consonant clusters’. Hispanic Review. 50(3): 247–66.
Mendez Dosuna, Julian and Carmen Pensado. 1986. ‘Can phonological changes have a morphological origin? The case of Old Spanish ie > i and ue > e’. Diachronica. 3: 185–201.
Mohanan, K.P. 1995. ‘The organization of the grammar’. In The Handbook of Phonological Theory, ed. by J. Goldsmith. Oxford: Basil Blackwell. 20–69.
Moravcsik, Edith A. 1978. ‘Language Contact’. In Universals of Human Language, ed. by Greenberg, Vol. 2: 93–122. Stanford: Stanford University Press.
Morin, Yves-Charles, Marie-Claude Langlois and Marie-Ève Varin. 1990. ‘Tensing of word-final [ɔ] to [o] in French: The phonologization of a morphophonological rule’. Romance Philology. 43: 507–28.
Postal, Paul. 1969. Aspects of Phonological Theory. NY: Harper and Row.
Sadock, Jerrold M. 1980. ‘Noun incorporation in Greenlandic: A case of syntactic word-formation’. Language. 57: 300–319.
Sadock, Jerrold M. 1991. Autolexical Syntax: A Theory of Parallel Grammatical Representations. Chicago: University of Chicago Press.
Singh, R. 1991. ‘Interference and contemporary phonological theory’. Language Learning. 41(2): 157–75.
Singh, R. ed. 1996. Trubetzkoy’s Orphan. Amsterdam: Benjamins.
Singh, R. and A. Ford. 1987. ‘Interphonology and phonological theory’. In Sound Patterns in Second Language Acquisition (= Studies on Language Acquisition, 5), ed. by Allan James and Jonathan Leather, pp. 163–72. Dordrecht: Foris.
Singh, R. and D. Parkinson. 1995. ‘L1, L2, and intermorphology’. Rivista di Linguistica. 7(2): 369–89.
Singh, R. and G. Martohardjono. 1988. ‘Intermorphology and morphological theory’. In Linguistic Theory and Second Language Acquisition, ed. by Wayne O’Neil and Suzanne Flynn, pp. 362–83. Dordrecht: Kluwer.
66 ! Rajendra Singh and Alan Ford
3
In Praise of Śākaṭāyana: Some Remarks on Whole Word Morphology1
Rajendra Singh and Alan Ford

Introduction

Pāṇini’s Aṣṭādhyāyī is, as Thieme (1971) points out, an extended argument, presumably contra Śākaṭāyana2 and others, that a grammar can be built up with small units, particularly in the domain of morphology. For Pāṇini, the word is an entirely derived entity, something made up of smaller pieces, put together according to the combinatorics he provides. The parts that enter into his combinatorics are according to him all real (cf. Deshpande 1997). Both his atomistic ontology and his methodology have been questioned in the immanent critique that begins with Patañjali’s insistence on nityatva and ends with Bhartrihari’s demonstration that words are seamless wholes and that the parts Pāṇinians delight in coming up with are at best grammatical fictions (cf. Singh 1998). Bhartrihari’s critique is, however, ignored by most. Even his defenders, such as Kelkar (1999), see him primarily as a philosopher of language, obviously making a distinction he would have abhorred, and do not draw what seem to us to be obvious grammatical conclusions from his insistence, for which he provides several substantial arguments, on seamlessness (for some recent examples of insistence on a somewhat differently motivated and grounded seamlessness, see Starosta [1999 and in this volume]).

Post-Renaissance grammatical practice in the West almost completely abandons the Greco-Roman construal of morphology as a study of relationships of shapes of whole words, and ends up, perhaps aided in this transition by its increased exposure to Hebrew grammarians, adopting the Pāṇinian position, later espoused by leading structuralists from Saussure (cf. Singh 1992 and Vajpeyi 1997) to Bloomfield. Despite some modern attempts to
revive the ancient Greco-Roman practice (cf. Matthews 1974 and Robins 1959, in particular), the Pāṇinian way of doing morphology has been dominant for centuries now, possibly because a fully formalized full-fledged alternative had not been made available until recently (neither-fish-nor-fowl attempts such as Anderson [1992] actually end up supporting the Pāṇinian view, as Sadock [1995] is happy to note). The purpose of this short dialogue-initiating note is to outline the alternative that has been available at least3 since Ford and Singh (1991), to show some of its applications, and to invite South Asianists to tell us why the Pāṇinian view of morphology should be preferred. Hoping to shift the burden of proof, we shall concentrate not on the critique of that view, best characterized as morphemology (Janda 1983), but on the presentation of whole-word morphology.
The Theory

All that needs to be said about word structure in any language (of any type whatsoever) can and must be said by instantiations of the schema in (1) below. We refer to these instantiations as W(ord) F(ormation) S(trategies) because as generalizations drawn from known particular facts, they can be activated in the production and understanding of new words (cf. Ford and Singh 1991 and Ford, Singh and Martohardjono 1997).4 WFSs must be formulated as generally as possible, but, and this is crucial, only as generally as the facts of the matter permit.

(1) /X/a ↔ /X′/b

where:
a. /X/a and /X′/b are words and X and X′ are abbreviations of the forms of classes of words belonging to categories a and b (with which specific words belonging to the right category can be unified or on to which they can be mapped)
b. ′ represents (all the) form-related differences between /X/ and /X′/
c. a and b are categories that may be represented as feature-bundles
d. the ↔ represents a bi-directional implication (if X, then X′ and if X′, then X)
e. X′ is a semantic function of X
f. ′ can be null if a ≠ b
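The schema in (1) can be read as a pair of patterns related in both directions. The following Python fragment is only an illustrative sketch of that reading, not the authors' formalism: the names `Pole` and `Strategy` are hypothetical, the constant material is restricted to a string before and after the variable X, and ordinary spelling stands in for a phonemic transcription.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pole:
    """One side of a strategy: constant material around the variable X, plus a category."""
    before: str    # constant material preceding X (may be empty)
    after: str     # constant material following X (may be empty)
    category: str  # category label, e.g. "n" or "v"

    def unify(self, word: str) -> Optional[str]:
        """Return the part of `word` that X stands for, or None if the word does not fit."""
        fixed = len(self.before) + len(self.after)
        if len(word) <= fixed:
            return None  # X must stand for something
        if not (word.startswith(self.before) and word.endswith(self.after)):
            return None
        return word[len(self.before):len(word) - len(self.after)]

    def spell(self, x: str) -> str:
        """Return the whole word obtained by substituting `x` for X; no boundary is recorded."""
        return self.before + x + self.after


@dataclass(frozen=True)
class Strategy:
    """A bi-directional relation between two poles, as in schema (1)."""
    left: Pole
    right: Pole

    def forward(self, word: str) -> Optional[str]:
        x = self.left.unify(word)
        return None if x is None else self.right.spell(x)

    def backward(self, word: str) -> Optional[str]:
        x = self.right.unify(word)
        return None if x is None else self.left.spell(x)


# Orthographic stand-in for a strategy relating nouns to nouns in -ism (an assumption).
ISM = Strategy(Pole("", "", "n"), Pole("", "ism", "n"))
# A strategy whose two poles differ only in category (the formal difference is null).
ZERO = Strategy(Pole("", "", "v"), Pole("", "", "n"))
```

In this sketch the output of `forward` or `backward` is a plain string, which is meant to mirror the claim that strategies license whole words without internal brackets or boundaries.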
Some Consequences

It should be obvious that (1) above denies both intra-linguistic (inflection vs derivation, affixation vs compounding etc.) and interlinguistic (flectional, isolating etc.) morphological diversity, and offers a unified account of what have sometimes been seen as different types of morphologies. The diversity that exists can be read off the system of strategies that instantiate (1) above, but it does not need to be expressed as a difference in type: a difference in content does not constitute a difference in form (of rules or strategies). (1) also denies any theoretical status to descriptive labels such as ‘concatenative’, ‘non-concatenative’, ‘affixal’, ‘non-affixal’ etc. Again, multiplicity is superficial, and resides in descriptive, pedagogical paraphrases of instantiations of (1).

As all morphological relationships can be expressed by strategies instantiating (1), morphology has little or no architecture and, to change the metaphor, no traffic rules (such as krt before taddhita). Although there may well be constraints on what sorts of things can be morphologized, i.e., constitute categories relevant for a morphological description, there are no constraints on particular instances of (1), though all manifestations of (1) must, obviously, relate (single) words with (single) words.5

Representations of the speaker’s knowledge of the patterns of morphological relatedness in her language, Morphological Strategies (= instantiations of [1]) are invoked only in moments of crisis, i.e., when the speaker needs to analyze or fashion a word she needs for the purpose at hand, often to meet a syntactically enforced requirement. Their exploitation, of course, helps her to bridge the gap between the actual words she happens to know and the possible words she can be said to know; actually, their existence makes the known merely a subset of the knowable.
When they ARE invoked to produce what will become words, their ‘outputs’ are seamless wholes, with no brackets, boundaries, or a-cyclic graph fragments in them. They are not there to be deleted; they are just not there. WFSs cannot supply these things because they do not have them. Neither the strategies nor their ‘outputs’ have any syntactic constituency
relationships marked in them in any fashion whatsoever. In both the active and the passive mode, they licence the words a speaker has or may come up with (in the ‘on line’ mode).
Some South Asian Examples

Below, we provide some examples of morphological strategies from English and other South Asian languages (the parenthetical comment draws attention to what some would like attention drawn to). When not enclosed in phonemic bars, words from languages other than English are given in their standard transliterated form (in which D stands for retroflex d) and are provided with glosses. Although more than the minimum number of words required for the postulation of a WFS (= 2 pairs of words exhibiting the same difference) is available in each case, we provide only one pair in each case to save space. The category specifications below are also minimal (but sufficient to meet the needs at hand).

(2a) English
/X/n ↔ /XIzm/n
Marx ↔ Marxism

(2b) English
/XIk/adj ↔ /XIsIzm/n
critic ↔ criticism

(The /s/ in criticism is a concomitant consequence and an integral part of the morphological operation that CAN be used to form nouns from adjectives terminating in /Ik/, an operation or rule that is in competition with the general rule in (2a). It is, therefore, part of the representation of the word criticism, to which the phonology of English (= the phonological processes of English) will apply, as it must. Needless to add, the Pāṇinian view of phonology sees the /s/ in criticism as ‘phonologically derived’ from a /k/ despite the fact that the allegedly phonological part of the operation in question is NOT generalizable beyond the morphological categories with which it is bound and in which it actually shows up.)

(3) Khasi
/X/n ↔ /mynX/n
step ‘rising sun’ ↔ mynstep ‘morning’
There is, apparently, some debate about myn-words being affixed words or compounds (cf. Philip 1997), a debate which presupposes that the distinction is a viable one! (cf. Tirumalesh 1997).
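The competition between (2a) and (2b) can be illustrated with a small sketch. It is an illustration under stated assumptions rather than an implementation of the authors' model: the regular expressions, the function name `candidates`, and the ASCII transcriptions `krItIk` and `marks` are all hypothetical, and category labels are left out for brevity. The point is only that every strategy whose pole fits a word licenses a candidate, and the grammar itself does not rank them.

```python
import re

# Each entry: (label, pattern for the left pole, template for the right pole).
# X is the variable part; everything else in the pattern is constant material.
STRATEGIES = [
    ("2a", r"^(?P<X>.+)$", r"\g<X>Izm"),       # /X/ <-> /XIzm/
    ("2b", r"^(?P<X>.+)Ik$", r"\g<X>IsIzm"),   # /XIk/ <-> /XIsIzm/
]


def candidates(word):
    """Return every form licensed for `word`, keyed by the strategy that licenses it."""
    licensed = {}
    for label, left, right in STRATEGIES:
        match = re.match(left, word)
        if match:
            licensed[label] = match.expand(right)
    return licensed
```

Under these assumptions a form ending in /Ik/ falls within the range of both strategies, so `candidates` returns two forms for it, while a form such as `marks` is matched only by the general strategy. Which candidate a speaker settles on is, on the view presented here, not decided by the strategies themselves.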
(4) Kashmiri
/X/v ↔ /X/n
thag ‘to cheat’ ↔ thag ‘a cheat’

(The agent noun is generally said to be derived from the verb with the help of the famous zero-suffix, despite the fact that in a very large number of cases there is not a shred of evidence to support the putative deverbal character of the agent noun (cf. Wali and Koul 1997: 270: ‘The syntax of deverbal nouns is similar to that of nonderived nouns with respect to gender, number, and casemarking’) and despite diachronic derivations that follow the opposite path.)

(4b) Kashmiri
/XVC/v.int. ↔ /XV:C/v.caus.
mar ‘to die’ ↔ ma:r ‘to kill’

(This vriddhi alternation is treated by some scholars as a part of Kashmiri phonology despite the fact that inter-consonantal vocalic lengthening is never required in Kashmiri and is in fact not only associated with causativization but is the only mark of it in words like /ma:r/ ‘to beat’, /ga:l/ ‘to melt (causative)’, /da:l/ ‘to remove’.)

(5) Bangla (a.k.a. Bengali)
/X/v ↔ /Xnewa:/v
likhe ‘write’ ↔ likhenewa: ‘to first write down’

(/newa:/ is one of the handful of ‘supporting verbs’ or vectors that appear in structures traditionally described as ‘compound verbs’ despite the fact that the freedom that label implies would generate far more combinations of verbs with verbs than are actually treated as ‘compound verbs’ even by those who use this label. The grammatical subservience of elements like /newa:/ in such structures is demonstrated with remarkable clarity and elegance in Dasgupta (1989: 215–22) in particular.)

(6a) Hindi
/Xa/n,nom,sing,masc ↔ /Xi/n,sing,fem
laDaka: ‘boy’ ↔ laDaki ‘girl’
As masculine nouns ending in /a/ have a straightforward feminine correspondent in /i /, there is absolutely no need to postulate an intermediate laDak, an entity whose postulation is forced by the rather
peculiar architecture of Pāṇinian morphology. The generalization SPEAKERS use is (6a) (cf. Singh and Agnihotri 1997).

(6b) Hindi
/X/n ↔ /ghuDX/n
sava:r ‘rider’ ↔ ghuDsava:r ‘horse-rider’

(/ghuD/ can, obviously, be called the combining form of the word /gho:Da:/ ‘horse’ in this vikari ‘compound’, but it is not clear what is gained by doing so, particularly because the form that combines is not always the combining form (see [6c]). It is hard to see why Pāṇinians do not call it a ‘prefix’.)

(6c) Hindi
/X/n ↔ /Xga:Di/n
ghoDa: ‘horse’ ↔ ghoDa:ga:Di ‘horse-carriage’
(Words like /gho:Da:ga:Di/ are standardly presented as made up of two words despite the fact that whereas /gho:Da:/ is very freely commutable, /ga:Di/ is hard or impossible to find a substitute for (cf. Singh and Dasgupta 1999). It is quite clear that despite the fact that there is no vikaar in /ga:Di/, it has lost its word-hood; it only sounds like the word /ga:Di/, as Bhartrihari would say.)

(7) Sanskrit
/Xen/n,masc,sing,inst ↔ /Xsya/n,masc,sing,gen
ka:mena ‘love’ ↔ ka:masya ‘love’

(Whenever the masculine singular instrumental noun ends in /en/ the corresponding masculine singular genitive ends in /sy/, something quite directly accessible to those who know Sanskrit and yet not quite that easy to state in neo-Pāṇinian approaches, which must take these forms through some intermediate bridge-head, unnecessarily in our view.)

(7b) Sanskrit
/Xa/v,imp,II,sing ↔ /Xa:mi/v,pres,I,sg
bhava ‘you be’ ↔ bhava:mi ‘I am’

(Notice that there is no need to go through the famous ‘root’ /bhu/ to capture this part of what is involved here. Nor is there any need to appeal to this fictitious construct to capture other such relationships.)
(7c) Sanskrit
/X/n,dat,pl ↔ /X/n,abl,pl
ka:mebhyas ‘love’ ↔ ka:mebhyas ‘love’

(That this straightforward partial syncretism must, under Pāṇinian lights, be stated in a meta-grammar of Sanskrit or made to follow from some Paradigm Structure Condition(s) is surely a reflection on those lights.)
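The examples above differ in where the constant material sits (before X, after X, or inside the word), but they can all be written in one format. The sketch below is a hedged illustration of that point only: the dictionary `STRATEGIES`, the helper names, and the vowel and consonant classes `V` and `C` are assumptions made for the example, and the strings are the transliterated forms cited in (3), (4b) and (7b).

```python
import re

V = "[aeiou]"    # assumed vowel inventory, for illustration only
C = "[^aeiou:]"  # assumed: any symbol that is neither a vowel nor the length mark

# Each strategy is a pair of poles; each pole is (pattern to match, template to build).
STRATEGIES = {
    # (3) Khasi: /X/ <-> /mynX/
    "3": ((r"^(?P<X>.+)$", r"\g<X>"),
          (r"^myn(?P<X>.+)$", r"myn\g<X>")),
    # (4b) Kashmiri: /XVC/ <-> /XV:C/
    "4b": ((rf"^(?P<X>.*)(?P<V>{V})(?P<C>{C})$", r"\g<X>\g<V>\g<C>"),
           (rf"^(?P<X>.*)(?P<V>{V}):(?P<C>{C})$", r"\g<X>\g<V>:\g<C>")),
    # (7b) Sanskrit: /Xa/ <-> /Xa:mi/
    "7b": ((r"^(?P<X>.+)a$", r"\g<X>a"),
           (r"^(?P<X>.+)a:mi$", r"\g<X>a:mi")),
}


def relate(label, word, source, target):
    """Match `word` against pole `source` (0 or 1) and rebuild it on pole `target`."""
    poles = STRATEGIES[label]
    match = re.match(poles[source][0], word)
    return match.expand(poles[target][1]) if match else None


def forward(label, word):
    return relate(label, word, 0, 1)


def backward(label, word):
    return relate(label, word, 1, 0)
```

Nothing in this format distinguishes a "prefix", a "suffix" or an internal vowel change as a separate type of rule; the three entries differ only in the content of their patterns, which is the sense in which a difference in content is not treated as a difference in form.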
Conclusion

Even at the risk of being redundant, we wish to underline the fact that the WFSs presented in Section 3 above do not appeal to or use any Pāṇinian construct such as dhatu, anga, vibhakti, pratyaya etc. Nor do they use concepts such as inflection, derivation, and compounding etc. Yet these strategies say exactly what needs to be said about the bits of morphology they describe, and some of what they do say is hard, if not impossible, to say in Pāṇinian terms. Whereas Pāṇinian morphology sees what could be called morphological complexity as a matter of layers of morphological structure, our strategies invite one to think of words it would call complex as made up of variables and constants that have been non-hierarchically put together, provided, of course, there are strategies that licence such analyses. Thus, both English Marxism and Hindi gho:Da:ga:Di can be analyzed as made up of substrings that correspond to what is varied and what is held constant in the relevant strategies (they are, it is important to underline, identified as such ONLY in the strategies). If these strategies are in fact invoked to create these words, they will not, we want to emphasize, supply any boundaries or brackets, only seamless wholes (Marxism and gho:Da:ga:Di) that will show up as words after phonological processes have given them the phonetic shape they must have to count as words. The need to divide non-category bearing substrings into roots and stems, etc., or what is held constant and can be seen as strings into prefixes, suffixes, and infixes, etc., remains a mystery to us. As for the word, it is clearly indicated by a bald, unadorned and unsupported X, whose ability to bear a category does not depend on the presence of some other supporting material or by an X AND the bound material whose support it needs before it can take on the burden of bearing a category (cf. Hindi gho:Da: as an instance of the left-hand pole of [6c]
and Hindi /laDak/ as an instance of the variable in the left or right hand pole of [6a]). And it is, of course, forever nitya. What is held constant but cannot be neatly localized as a ‘morpheme’ is, of course, the Achilles’ heel of morphemology. Although the different types of substrings in deconcatenatable representations can be easily, perhaps even trivially, identified, there is no reason to give them any status or special names, except perhaps for heuristic and pedagogical reasons, and even then a caveat lector is needed. As for morphological typology, it is perhaps only a matter of the types of Xs that dominate particular morphologies. Thus, it is possible to refer to a morphological system or a part of a morphological system in which only bald, unadorned Xs bear categories as word-based, and to systems or subsystems characterized by the absence of such bald Xs as non-word-based. This naming device does not, however, require giving up the assumption that morphology relates whole words with whole words, obviously contra Pāṇini.
Postscript

Is it too arrogant to suggest that perhaps the only things in The Aṣṭādhyāyī that seem sustainable are its rejection of the putative distinction inflection/derivation⁶ and of so-called conjugational and declensional classes? The former (inflection vs derivation) is a result of confusing form with function (cf. Singh and Ford 1979). As for the latter, we have, hopefully, shown that it is indeed possible to do morphology without declensional and conjugational classes without paying the heavy price the celebrated Pāṇinian invocation of ‘internal sandhi’ here seems to demand and exact.
Notes

1. Reproduced from Singh (2000), The Yearbook of South Asian Languages and Linguistics 2000, Sage. We are grateful to Probal Dasgupta, Wolfgang Dressler, Ashok Kelkar, David Stampe, and Stanley Starosta for convincing us that it was better to show precisely how what we say can be said is in fact said with the ‘minimalist’ formalism we propose for morphology. We are also grateful to Sylvain Neuvel for drawing our attention to certain matters of exposition.
2. Śākaṭāyana, an honoured name mentioned by Pāṇini himself, is known to have argued that affixes do not have any meanings (outside the words they appear in). Although his work has not survived, we speculate that he must have argued for what we can call whole-word morphology, the view from which the non-autonomy of affixes would naturally follow. We argue for that non-autonomy in Ford and Singh 1999.
3. Although a full outline of the theory in question is provided only in Ford and Singh (1991), implicit and explicit suggestions regarding its shape and claims are available in papers written as early as the early 1980s (cf. Ford and Singh 1983 and 1984).
4. The word is a quantum of information whose particle properties are made reference to by phonology and morphology while its syntax and semantics make its wave properties explicit.
5. The point of saying it this way is to make it clear that so-called compounds are single words and DO NOT contain two or more words (cf. Singh and Dasgupta).
6. As most contemporary versions of Pāṇinian morphology do not reject this distinction, we refer here specifically to The Aṣṭādhyāyī and NOT to Pāṇinian morphology (in general).
References

Agnihotri, Rama Kant. 1997. ‘Is ghuDsavar a compound?’ Paper presented at South Asian Language Analysis XVIII Round-table, January. New Delhi: Jawaharlal Nehru University.
Anderson, S.A. 1992. A-morphous Morphology. Cambridge: Cambridge University Press.
Dasgupta, P. 1989. Projective Syntax: Theory and Applications. Pune: Deccan College.
Deshpande, M. 1997. ‘Building blocks or useful fictions: Changing views of morphology in Ancient Indian Grammatical Thought’. In India and Beyond, Aspects of Literature, Meaning, Ritual and Thought: Essays in Honour of Frits Staal, ed. by Dick van der Meij, pp. 71–127. London: Kegan Paul International (in association with the Institute for South Asian Studies, Leiden).
Ford, A. and R. Singh. 1983. ‘On the status of morphophonology’. In Papers from the Parasession on the Interplay of Phonology and Morphology, ed. by J. Richardson et al., pp. 79–95. Chicago: CLS.
Ford, A. and R. Singh. 1984. ‘Remark on the directionality of word-formation rules’. In The Proceedings of the Eastern States Conference on Linguistics (ESCOL), pp. 205–13. Columbus, Ohio: Ohio State University.
Ford, A. and R. Singh. 1991. ‘Propedeutique morphologique’. Folia Linguistica. 25(3–4): 549–75.
Ford, A. and R. Singh. 1996a. ‘Reply to Mohanan and Janda’. In Trubetzkoy’s Orphan, ed. by R. Singh, pp. 166–70. Amsterdam: Benjamins.
Ford, A. and R. Singh. 1996b. ‘Quelques avantages d’une linguistique débarrassée de la morpho(pho)nologie’. In Trubetzkoy’s Orphan, ed. by R. Singh, pp. 119–39. Amsterdam: Benjamins.
Ford, A. and R. Singh. 1999. ‘Why we don’t need a second lexicon’. In Linguistics Today. 3(1): 1–25.
Ford, A., R. Singh and Gita Martohardjono. 1997. Pace Pāṇini: Towards a Word-based Theory of Morphology. New York: Peter Lang.
Janda, R. 1983. ‘Morphemes aren’t something that grows on trees: Morphology as more the phonology than the syntax of words’. In Papers from the Parasession on the Interplay of Phonology, Morphology, and Syntax, ed. by J. Richardson et al., pp. 79–95. Chicago: CLS.
Kelkar, A.R. 1999. ‘What has Bhartrihari got to say on language?’ In The Yearbook of South Asian Languages and Linguistics 1999, ed. by Rajendra Singh. New Delhi: Sage Publications.
Matthews, P.H. 1974. Morphology: An Introduction to the Theory of Word-structure. Cambridge: Cambridge University Press.
Philip, M. 1997. ‘Compounding in Khasi’. In Languages of Tribal and Indigenous Peoples of India, ed. by Anvita Abbi, pp. 361–72. Delhi: Motilal Banarsidass.
Robins, R.H. 1959. ‘In defence of WP’. Transactions of the Philological Society. 1959: 116–44.
Sadock, J. 1995. ‘Review of S.A. Anderson, A-morphous Morphology’. Natural Language and Linguistic Theory. 13(2): 327–41.
Singh, Prem. 1992. ‘Re-thinking history of linguistics: Saussure and the Indic connection’. In Language and Text: Studies in Honour of Ashok R. Kelkar, ed. by R.N. Srivastava, pp. 43–50. New Delhi: Kalinga.
Singh, R. 1998. ‘India’s immanent critique of Pāṇini and its descriptive implications’. Plenary lecture, Nineteenth South Asian Languages Analysis Round Table, 18 July. York, England: York University.
Singh, R. and A. Ford. 1979. ‘Flexion, derivation et Pāṇini’. In Amsterdam Studies in the History of Linguistics, ed. by Konrad Koerner. Amsterdam: Benjamins.
Singh, R. and P. Dasgupta. 1999. ‘On so-called compounds’. In The Yearbook of South Asian Languages and Linguistics 1999, ed. by Rajendra Singh. New Delhi: Sage Publications.
Singh, R. and R.K. Agnihotri. 1997. Modern Hindi Morphology. Delhi: Motilal Banarsidass.
Starosta, Stanley. 1999. ‘Historical reconstruction without morphemes’. Paper presented at the Poznan Linguistics Meeting, 1 May. Poland: Poznan.
Thieme, Paul. 1971. Kleine Schriften. Wiesbaden: Franz Steiner.
Tirumalesh, K.V. 1997. ‘Kannada prefixation’. In Phases and Interfaces of Morphology, ed. by M. Hariprasad, H. Nagarajan, P. Madhavan, and K.G. Vijaykrishnan, pp. 74–84. Hyderabad: The Central Institute of English and Foreign Languages.
Vajpeyi, A. 1997. ‘Contemporary linguistic theorizing and Sanskrit’. Paper presented at South Asian Language Analysis XVIII Roundtable, January. New Delhi: Jawaharlal Nehru University.
Wali, Kashi and O. Koul. 1997. Kashmiri. London: Routledge.
4 On So-called Compounds¹

Rajendra Singh and Probal Dasgupta

The purpose of this note is (1) to flesh out, with the help of some facts pertaining mostly to Bangla and Hindi, the somewhat cryptic remarks Ford and Singh (1996a: 168) and Ford, Singh and Martohardjono (1997: 58ff) make regarding what are normally called compound words, and (2) to invite our readers to tell us why we should not reject, lock, stock and barrel, the standard characterization of compounds as a subclass of words made of two or more words (cf. Anderson 1992, Bloch and Trager 1942, Bloomfield 1933, Lieber 1981, Selkirk 1982, Spencer 1991 and Williams 1981, amongst several others, virtually everyone, even Jespersen 1949; for a notable exception, see Starosta in this volume and Starosta et al. 1997). The uncritical acceptance of the characterization in question, unfortunately, seems to lead researchers not to even notice the sort of contradiction one finds in, for example, Vijaykrishnan (1994): although he notes, for example, that in the ‘compounds’ he is concerned with ‘the constituent (inner) V of the verb compound is not accessible in the syntax’ (ibid., p. 266), he sees no difficulty in postulating or operating with structures like [[X]n [Y]v]v. It seems necessary to issue such an invitation because those who reject word-internal syntactic compositionality are often asked, ‘but what about compounds?’ as if they had never heard of them. The appropriate response to the question, ‘what about compounds’, we’d like to suggest, is: ‘what about them?’ We choose Bangla and Hindi endocentric ‘noun-noun compounds’ to make our case because ‘compounds’ like these are often presented as providing particularly telling evidence for ‘the internal structure of words’ (cf. Anderson).² Ford and Singh reject that characterization because their theory of morphology, which we shall, following a suggestion of Starosta’s (personal communication), refer to as Seamless Morphology, does
not allow them to entertain different kinds of morphologies for what are seen as different subclasses of words. They insist on a unified morphology without any internal structure beyond word-internal phonological and semantic structure and claim that words standardly seen as morphologically complex are simply words that can be deconcatenated into two substrings, one representing a variable and the other representing a constant, with the help of some Word Formation Strategy, though the representations of these words do not contain any indication of the parts they are sometimes said to be made up of. The English word banality can, thus, be said to contain two substrings, banal and ity, and the word doghouse can be said to contain the substrings dog and house. In the former case, the substring banal takes, to anticipate a bit (cf. 3 below), the place of the variable X in the WFS /X/Adj. ↔ /Xiti/Noun, and the substring ity is the constant already specified in the relevant WFS. There is, they claim, no significant difference between the ity in banality and the house in doghouse: they are both constants. The Pāṇinian attempt to situate its various constructs (root, stem, affix, augmentative, etc.) within a structured hierarchy, as we shall see, seems unable to cope with the facts of the matter. The fact that the house in doghouse looks like the word house is only an appearance, as Bhartrihari (cf. Iyer 1977) would have said. It certainly falls short of the commutative possibilities of dog, which, as a moment’s reflection will show, can be replaced by any number of words. Grammaticalization flies in the face of the common belief, not to say illusion, that in expressions such as doghouse, either of the two elements can be freely replaced.
When this belief is turned into a hypothesis, as in Dowty (1979), it turns out to put too heavy a burden on pragmatics: what, for example, is the PRAGMATIC reason which allows Terminvorschlag and Terminvereinbarung in German but blocks *date proposal and *date agreement in English? Facts of ‘compounding’ from various languages indicate quite clearly that ‘compounds’ come in sets, each set anchored in some specific constant. There are, in other words, exceptions to all of Levi’s (1978) nine classes, but only because she has only nine. What is needed is the identification of as many sets as there are constants. The rather large number of rules Fanselow (1981) rightly postulates for German compounds should, therefore, come as no surprise. If, as Ford and Singh maintain, ‘compounds’ are locally anchored in specific constants, representational analogues of the popular
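bi-motic view of them must be rejected. They cannot, in other words, be assigned structures such as (1), where x and y represent categories:

(1) [Tree diagram: a category-labelled mother node dominating two category-labelled terminal nodes, with labels drawn from x and y]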
The problem is that there is no justification for assigning any category to any of the terminal nodes: a two-element ‘compound’ is ONE word and NOT three words. Consider the following paradigm from Modern Hindi:

(2) ghODAgADi ‘horse-carriage’
    UntTgADi ‘camel-carriage’
    bElgADi ‘ox-carriage’
    bhEnsAgADi ‘water-buffalo carriage’
There is a rather large number of nouns, and it can be extended, that can occupy the place of the first element, the variable in the examples in (2), not surprisingly because that is what ‘the place of the variable’ means; equally unsurprisingly, there is not much that can take the place of gADi, the constant. Notice that facts of this kind find a straightforward expression in the sort of morphological theory we have alluded to. Briefly, it expresses all word-formation processes with WFSs that have the following shape:

(3) /X/a ↔ /X′/b

where X and X′ are words, a and b are morphological categories (feature bundles), ↔ indicates a bidirectional relationship, X′ is a semantic function of X, and ′ represents the form-related difference(s) between X and X′. The ′ represents the constant involved in the morphological operation and everything else in the string is referred to as a variable. Given the ‘minimalist’ apparatus above, the facts of (2) are expressed as follows:

(4) /X/N ↔ /XgADi/N³

Notice that (4) is not, as it indeed cannot be, different from (5) below, which expresses the generalization involved in what Pāṇinians would call the suffixation of -ity (in words like banality):
(5) /X/Adj. ↔ /XIti/N

The idea is not to claim that -ity is a word of English, something morphologists like Lieber (1981 and 1992) come dangerously close to claiming, or that gADi is not a word of Hindi, but to claim that neither ghODA nor gADi in the examples in (2) is a word of Hindi: neither can be accessed for any grammatical purpose, and gADi is in addition no longer commutable. By virtue of the fact that it is now a part of XgADi, where it is no longer commutable, gADi has not only lost its syntactic quality of ‘wordhood’ but may also be poised at the brink of losing its semantic independence. Notice that whilst the classification of both ghODA and gADi as stems may appear to cope with the ‘compounding facts’ of languages that have rich morphologies and must strip their words of what some call ‘inflection’ before putting them in their ‘compounds’, it does little to answer the question: ‘Why are there these commutability differences between the two stems?’, a strong indication, we believe, of the dispensability of constructs such as ‘stem’. As we said before, only two sorts of things are needed for complex words: variables and constants.⁴ The fact that both ‘roots’ and ‘stems’ can function both as variables (cf. nomin- in nominate and dog in doghouse) and as constants (-ceive as in English receive and gADi in Hindi ghODAgADi) shows that the Pāṇinian atoms of morphology don’t quite stack up the way they should. The pasting can’t work because the cutting seems fundamentally flawed. Be that as it may. We now turn to the fact that the semantics of ‘compounds’ requires what, following Quine (1960), could be called pada-categorematicity: the ‘meanings’ of the elements that go into the making of ‘compounds’ are word-held. Used as a word in a Bangla sentence the sound sequence /pOtro/ means ‘letter or document’ in some contexts and ‘leaf’ in others.
However, when the sequence /pOtro/ occurs in the compounds (6a–e), it contributes to these words only the ‘letter’ meaning:

(6) a. prempOtro ‘love letter’
    b. proSnopOtro ‘question paper’
    c. niyogpOtro ‘appointment letter’
    d. uttorpOtro ‘answer script’
    e. pOdottEgpOtro ‘resignation letter’
In contrast, when it occurs in the compounds (7a–e), /pOtro/ contributes only the ‘leaf’ meaning to these words:

(7) a. billopOtro ‘wood-apple leaf’
    b. nimbopOtro ‘margosa leaf’
    c. amropOtro ‘mango leaf’
    d. SOptopornipOtro ‘chatim leaf’
    e. pOddopOtro ‘lotus leaf’
Even under conditions that should enable a pragmatic override, a boy giving a girl a leaf to express unusual love simply will not call it a /prempOtro/ ‘love leaf’, and a document eulogizing a lotus-flagged political party simply cannot be called a /pOddopOtro/ ‘lotus document’. No such pragmatic overrides exist. The full word /pOtro/ does retain its full range of meaning options when used in a true syntactic construction. And the word-fragment /pOtro/ in a ‘compound’ does get semantically constricted. The language compels it to contribute the ‘letter’ meaning to the ‘compounds’ (6a–e) and the ‘leaf’ meaning to (7a–e). This compulsion can be expressed formally. We might propose the following WFSs, details negotiable, for these consistent patterns:

(8) WFS for (6a–e): /X/N, act/abstr ↔ /XpOtro/N, act/abstr
(9) WFS for (7a–e): /X/N, botanical ↔ /XpOtro/N, botanical, leaf

Beyond these active patterns lie two isolates:

(10) tamropOtro ‘copper plate’
(11) bhurjopOtro ‘birch bark’
The ‘compound’ words (10) and (11) are based on /tamro/ ‘copper’, which is still in use as a word, and on /bhurjo/ ‘birch’, which has become obsolete in that role. What is of some interest is that the sequence /pOtro/ can contribute a ‘sheet of metal’ meaning nowhere but in the hapax (10). And it can contribute a ‘bark’ meaning only in the hapax (11). In no syntactic context in Bangla does the independent word /pOtro/ ever mean ‘(metal) plate/sheet’ or ‘bark’:

(12) *tamrer ekTi pOtro for ‘a plate/sheet of copper’
(13) **bhurjobrikkher kichu pOtro for ‘some bark of a birch tree’ (double ** because /bhurjo/ is obsolete in all contexts other than (11)).
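The division of labour between the active patterns (8) and (9) and the isolates (10) and (11) can be mimicked in a few lines of Python. This is only a rough sketch under stated assumptions, not part of the formalism itself: the names `Word` and `Strategy`, the semantic tags attached to the two toy entries, and the `letter` tag on the output of (8) are our own illustrative labels. What the sketch is meant to show is that each strategy has exactly one variable, that the strategy (not a free word /pOtro/) decides which meaning is contributed, and that the output is a string with no internal boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str          # transliterated form, stored without any internal boundary
    category: str      # morphological category, e.g. "N"
    tags: frozenset    # illustrative semantic tags (an assumption of this sketch)


@dataclass(frozen=True)
class Strategy:
    """One-variable WFS of the shape /X/a <-> /X + constant/b."""
    constant: str          # the string held constant, e.g. "pOtro"
    category: str          # category on both poles (N in (8) and (9))
    required_tag: str      # semantic restriction on the variable X
    added_tags: frozenset  # what the right-hand pole adds, e.g. {"leaf"}

    def apply(self, word):
        """Map the left-hand pole to the right-hand pole, or return None."""
        if word.category != self.category or self.required_tag not in word.tags:
            return None
        return Word(word.form + self.constant, self.category,
                    frozenset({self.required_tag}) | self.added_tags)


# Hypothetical toy lexicon: the tags are our own labels.
prem = Word("prem", "N", frozenset({"act/abstr"}))      # 'love'
pOddo = Word("pOddo", "N", frozenset({"botanical"}))    # 'lotus'

wfs8 = Strategy("pOtro", "N", "act/abstr", frozenset({"letter"}))  # cf. (8)
wfs9 = Strategy("pOtro", "N", "botanical", frozenset({"leaf"}))    # cf. (9)

# Isolates such as (10) and (11) are simply listed as whole words;
# no strategy in this sketch produces them.
isolates = {"tamropOtro": "copper plate", "bhurjopOtro": "birch bark"}

love_letter = wfs8.apply(prem)         # right-hand pole of (8)
lotus_leaf = wfs9.apply(pOddo)         # right-hand pole of (9)
no_lotus_document = wfs8.apply(pOddo)  # (8) does not accept a botanical X
```

On this toy encoding, asking (8) for a ‘lotus document’ or (9) for a ‘love leaf’ simply yields nothing, which is one way of picturing the absence of pragmatic overrides noted above.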
We note that current syntactic thinking holds that all of us need to become accountable to morphology. We submit that this will involve facing at least the facts observable at (8–13). A grammar which allows (8–11) to appear to contain a true word /pOtro/ makes it impossible to even state the facts coherently. It follows that the null hypothesis should give a ‘compound’ word no internal grammatical structure (= structure beyond what may be required by phonology), that is, should not allow it to contain two or more words. The meaning of a ‘compound’ word may well be a structured object, possibly even a tree-using one. A proper unpacking of (8) with this question in mind may have to state that an /XpOtro/ is a letter/document embodying an action/abstraction, giving enough elbow room to say ‘embodying’. We do not oppose such a semantic analysis on grammatical grounds. The history of a ‘compound’ word certainly may lead us to refer to word-internal entities. We may wish to describe the /pOtro/ part of (10) and (11) as related to some older independent word capable of such semantics. The /bhurjo/ part of (11) manifestly comes from an older independent word, now obsolete. We do not reject such a diachronic analysis for synchronic reasons. To a historian, then, (11) may appear to continue the history of the word /bhurjo/, into loss of independent life, and of the word /pOtro/, into loss of one of its meanings when used independently. To the semantic eye, (6a) may look like a combination of the ‘love’ and ‘letter’ meanings, structured by the WFS (8) once this is rewritten in semantically careful terms (a task one gladly leaves to specialists). We have no wish to deny these pieces of semantics and history.
Our point, on the contrary, is that a synchronic grammar which tries to assimilate the supposed ‘work of assembling fragments into words’ to the actual work of putting words freely together into phrases is out on a limb that cannot bear the weight of stating the historical and semantic facts with accuracy and parsimony. We argue that the least that synchronic grammarians can do is refrain from imposing internal grammatical structure on ‘compound’ words, or on any other words. In other words, we argue that a grammar that seeks compatibility with the ineluctably, minimally needed historical and semantic statements must at least adopt the following proposals of Seamless Morphology, even if some readers disagree with us on issues of theory or formalization:
(14) a. A word has no internal grammatical (= non-phonological) structure.
     b. There is only one morphology, only one coherent set of regular inter-word mappings, in the grammar.

At point (14a), we reject the option of treating a ‘compound’ as a syntactic construction, and at (14b), we hold that a ‘compound’ word must be described in terms of particular one-variable WFSs like (8) and (9), never a two-variable WFS like (15):

(15) /X/          +  /Y/          ↔  /XY/
     properties A    properties B    properties A+B
If, contrary to (14a), ‘compound’ words were syntactically assembled, then their assembly would in principle be a free process yielding a combinatorics of the constituents that take part. It would then be surprising, rather than the norm, that elbow room means only ‘enough space for free movement’ and does not freely take on interpretations like ‘room for innovative dance involving linked elbows’; that bathroom only means ‘room with bath and/or toilet facilities’ and never ‘enough space to have a bath in’; or that bedroom can only mean ‘room to sleep in’ and not ‘room where someone keeps lots of beds stashed away’. In other words, the notion that ‘compounding’ is a syntactic process leads to obviously and massively false expectations for any language one cares to look at. We are mildly surprised that anyone can take it seriously.

We turn to (14b), our rejection of multiple morphologies, or our rejection of a separate WFS type such as (15) that might uniquely describe ‘compounds’. At first sight this seems to express at least the basic observations. For (15) does indicate how a ‘compound’ is like a helium atom, which does not contain two hydrogen atoms. We learn from (15) that a ‘compound’ word does not contain two independent words, but provides a structure. To this structure, two ‘pseudo-words’ contribute sound and meaning. This arrangement lets these dependent subwords liaise with independent correlates where available, but through morphological operations, thus allowing for opacities and leakages. Why do we reject even this? Our conceptual resistance to (15) comes from a desire to maintain, if evidence permits, the stronger hypothesis that only single-variable WFSs exist. Our empirical resistance comes from data of the sort given at (6–11). There we see that the facts relevant to the statement of ‘compound’ patterns are, first of all, organized very locally,
as clumps of material: consider the hapax nature of (10) and (11) (cf. the unavailability at [12] and [13] of these meanings for the true word /pOtro/) and the clumping in (6a–e) and (7a–e). Such clumping is unexpected under (15). Second, these very local forms of organization are formally monovariate, as we see clearly at (6) and (7). The ‘document’ constant takes an (abstr)action N variable in (8). In contrast, the ‘leaf’ constant takes a botanical N variable in (9). It is empirically not the case that the two slots in (6), say, permit free substitution.⁵ Consider replacing /pOtro/ ‘letter’ with expressions for ‘agent’ in (6a–e):

(16) a. premik ‘lover’
     b. proSnokOrta ‘questioner, question paper setter’
     c. niyogkari ‘appointer’, ‘employer’
     d. uttordata ‘answer-giver’, ‘respondent’
     e. pOdottEgkari ‘resigner’, ‘person resigning’
There are two clear ‘compounds’ here, (16b) and (16d), as /kOrta/ ‘agent’ and /data/ ‘giver’ are available as clearly independent words. It is less clear if /kari/ in (16c) is a defective word or an affix. And the /ik/ of (16a) is an affix on any account. Our point is that you don’t get the forms */premkOrta, niyogkOrta, uttorkOrta, pOdottEgkOrta/ with the freedom that the word syntax account would call for, and you don’t even get half of them as the bivariate account (15) would minimally call for. (6a–e) seems to be the only set where these five ‘first members of a compound’ pose together for a group photo. We conclude that the photograph is taken by a single constant, the /pOtro/ ‘document’ of (8). Thus, yet another time, we fail to find evidence for a clear case of a bivariate of the (15) type. We are appealing to linguists of various persuasions, then, to take the phenomenon of semantic reshaping seriously. If one wishes to see a ‘compound’ word as relatable to constituents available as independent words in the language, we argue, one must also notice that the semantic-phonetic correlations which those independent words embody are reshaped when they get integrated in this fashion, and we propose that, each time, there is an integrator, the constant, and an integree, in the variable slot. Let us take a case where many of us might think we see a symmetric pair of integree-integrators. Why, we must ask ourselves, do we not regard even (17) as evidence for (15)?
(17) a–i. DakghOr ‘post office’
     b. batighOr ‘lighthouse’         ii. Dakchap ‘postmark’
     c. kOlghOr ‘bathroom’            iii. Dakpion ‘postman’
     d. rannaghOr ‘kitchen’           iv. DakbakSo ‘postbox’
     e. bhaMRarghOr ‘store room’      v. DakTikiT ‘postage stamp’
It is clear that /DakghOr/ ‘post office’ in (17) appears at an intersection of the /ghOr/ set (17a–e) and the /Dak/ set (17i–v). We reject the view that (17) supports (15) because the expected compounds that one might code as (17b–ii) */bati-chap/, (17c–ii) */kOl-chap/, (17b–iii) */bati-pion/, etc. (all 16 of them) are uniformly starred. The example goes to show that there are only monovariate ‘compounds’. An intersection word like /DakghOr/ is simply a place where the WFS responsible for (17a–e), postulating [XghOr], and the one that handles (17i–v), which visualizes [DakX], de-concatenate the ‘compound’ in two different ways given in (17′) below, and thus insert it into two series, which meet at only this and no other point:

(17′) a. DakghOr : X ghOr ‘which space? post office’
      i. DakghOr : Dak X ‘postal who? post office’
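The two parses in (17′) can be made concrete with a short Python sketch. It is offered only as an illustration under our own assumptions: the function `parses`, the dictionary `strategies`, and the encoding of each strategy as a pair (material before X, material after X) are hypothetical conveniences, not part of the theory. The attested forms are those listed in (17).

```python
# Attested words from (17), stored as whole, unsegmented strings.
attested = ["DakghOr", "batighOr", "kOlghOr", "rannaghOr", "bhaMRarghOr",
            "Dakchap", "Dakpion", "DakbakSo", "DakTikiT"]

# Each strategy is written as (material before X, material after X).
# Exactly one side is non-empty, so each strategy has a single variable.
strategies = {"XghOr": ("", "ghOr"), "DakX": ("Dak", "")}


def parses(word):
    """Return every (strategy, variable) pair under which word deconcatenates."""
    found = []
    for name, (before, after) in strategies.items():
        if (word.startswith(before) and word.endswith(after)
                and len(word) > len(before) + len(after)):
            variable = word[len(before):len(word) - len(after)]
            found.append((name, variable))
    return found


def series(name):
    """All attested words that the named strategy can deconcatenate."""
    return {w for w in attested if any(n == name for n, _ in parses(w))}


# The two series meet at exactly one word.
meeting_points = series("XghOr") & series("DakX")

# Variables of each series, leaving out the intersection word itself.
first_elements = [v for w in attested for n, v in parses(w)
                  if n == "XghOr" and v != "Dak"]
second_elements = [v for w in attested for n, v in parses(w)
                   if n == "DakX" and v != "ghOr"]

# A bivariate rule like (15) would predict these cross-combinations.
unattested = [a + b for a in first_elements for b in second_elements
              if a + b not in attested]
```

In this encoding /DakghOr/ receives two parses, each with one variable, and no representation ever records a boundary between two ‘members’; the cross-combinations that a bivariate rule would license are computed only to show that none of them is among the attested forms.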
That the word has two monovariate parses allowing each side to play integrator in turn does not mean that it has a bivariate parse with two integrator-integrees. If clear evidence for (15) does surface somewhere, it will be important for all of us to face it from our various standpoints. Notice that our demonstration has carefully avoided what in the Indian tradition are called vikari ‘compounds’. These ‘compounds’, we believe, provide particularly telling evidence regarding the diachronic demotion and subordination of one or more elements involved in compounding. Here are some examples from Hindi (for many, many others, see Jain 1964 and for a discussion of some of them, see Agnihotri 1997):

(18) a. ghuDsavAr ‘horse rider’
     b. ghuDdaur ‘horse race’

We have avoided these because it is quite clear that the constant in these cases (ghuD in the examples above) can’t even create the illusion of being the word it is generally alleged to be SYNCHRONICALLY derived from.⁶ We have instead concentrated on cases that can create that illusion, and have led many a grammarian to sustain it. We would also like to point out that the theory sketched out above provides a fairly straightforward account of the sorts of facts that persuade Anderson to give up. Although it is not our intention to show why Anderson’s attempts to flirt with a genuinely a-morphous morphology cannot, unfortunately, be taken seriously, contemporary rules of the game require that we provide possible analyses for what even he sees as problematic (even, because he is supposed to be an advocate of the sort of morphology that we [cf. the Ford and Singh references cited in Dasgupta 1995 and Ford and Singh 1996b] and some others [cf. Starosta 1997 and 1988, for example] have argued for). There is, pace Anderson, nothing mysterious about English expressions like Afro-American or the use of s and en inside some German ‘compounds’. Despite the fact that Anderson treats them as two different kinds of things, both patterns exhibit vikar ‘distortion’ accompanying ‘compounding’. Afro is, to use slightly illegal language, simply the ‘prefix’ attached by the following rule in English:

(19) /X/adj. ↔ /æfroX/adj.

The /o/ in afro is as much morphological glue as the s or en in German, and there is nothing threatening, as Anderson thinks, about it. It is simply the imprint, or footprint, of the morphological operation: it is an inseparable part of that operation. As for expressions like English sons-in-law and brothers-in-law, Anderson gives up too soon again. There’s no reason why inlaw cannot be analyzed as a constant or, to use a Pāṇinian term, a pratyay (a suffix in this case).
The strategy for words like this is actually quite straightforward:

(20) /X/N ↔ /XInlɔ/N

The reader can easily verify for herself that none of the other cases seen as problematic by Anderson for an a-morphous morphology
actually present any problems for the theory of morphology under consideration; some of them actually provide pretty strong evidence for it. That they do for what HE calls a-morphous morphology is, obviously, irrelevant. Were it not the case that desired categories would be hard to guarantee, one COULD almost say that the behaviour of ‘compounds’ par excellence is such that one should anchor them even more in the specific ‘item’ around which they are built. To highlight the fact that the sequence gADi in Hindi is both a word and a constant, and the fact that there must have been a diachronic rule that conferred a double membership on it (one that brought the ‘free form’ into the class of constants as well), one might be tempted to go as far as to say that strategy (4) above should be rewritten as (21), which, though obviously wrong, dramatically captures the fact that one of the ‘free forms’ loses its freedom when it is functioning as a constant:

(21) /gADi/N ↔ /XgADi/N

We have argued that it is a mistake to ignore grammaticalization and to postulate, à la Pāṇini 2.1.1 (cf. Katre 1987) or Dowty, general rules of ‘compounding’. Words of the sort traditional definitions of ‘compounding’ talk about don’t seem to exist. We have also attempted to show that what appears to be the most exploited type of ‘compounding’, ‘endocentric noun-noun compounding’, is constrained in ways that appear unusual under traditional Pāṇinian lights. Perhaps it is time to turn those lights off. To conclude, the journey from Mahabhasya 1. 364 (cf. Kielhorn 1885) to Vakyapadiya III. 14. 30–75 (cf. Iyer 1977) was a long and arduous one, and it is a pity that the insights gathered at the end of it seem to have been buried rather deep. It is time to rediscover them (for a modest attempt to do so, see Singh 1998).
Notes

1. This note is a revised version of a paper which was presented at the GLOW colloquium held at CIEFL, Hyderabad, 20–22 January 1998, and which first appeared in R. Singh, ed., 1999, The Yearbook of South Asian Languages and Linguistics 1999. New Delhi: Sage Publications. We are grateful to Alan Ford, Yves-Charles Morin, K.P. Mohanan, and Stanley Starosta for numerous discussions regarding ‘compounds’.
2. For an illuminating analysis, somewhat along the lines followed here, of Chinese ‘compounds’, see Starosta et al. (1997).
3. Although the phonemic bars are supposed to enclose phonemic transcriptions, we use transliteration traditionally used for Indo-Aryan languages for ease of exposition. For non-I-A languages such as English, we do provide the required phonemic transcription.
4. The terms ‘variable’ and ‘constant’, it is important to point out, cannot be used to adduce any structure. They only identify parts of a string. As what appears as a constant in one word may appear as a variable in another, the issue is not a terminological one.
5. We are claiming, in effect, that new ‘compounds’ produced in spontaneous discourse (= ‘online’) will invariably contain an element that is known to function as a constant.
6. Its ‘distorted form’ ghuD keeps it from creating the illusion in question. As for ‘compounds’ like Hindi chavanni ‘quarter’ (quarter of a rupee), the less said the better: four AnAs do not add up to a chavanni, which used to be a coin. The affixal status of ‘unreduced’ elements from Sanskrit like Atma ‘self’ and pAl ‘one who protects’ is equally clear (cf. Balbir 1996: 54).
References
Agnihotri, Rama Kant. 1997. 'Is ghuDsavar a compound?' Paper presented at South Asian Language Analysis (SALA), XVIII Roundtable. January 1997. New Delhi: Jawaharlal Nehru University.
Anderson, Stephen R. 1992. A-morphous Morphology. Cambridge: Cambridge University Press.
Balbir, Nicholas. 1996. 'The Modernization of Hindi'. Perspectives on Language in Society, ed. by Shivendra K. Verma and Dalip Singh, pp. 36-60. New Delhi: Kalinga Publications.
Bloch, Bernard and George Trager. 1942. Outline of Linguistic Analysis. Baltimore: Linguistic Society of America.
Bloomfield, Leonard. 1933. Language. New York: Holt, Rinehart.
Dasgupta, Probal. 1995. 'The importance of being ernist'. Linguistic Analysis. 25: 121-36.
Dowty, David. 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Fanselow, Gisbert. 1981. Zur Syntax und Semantik der Nominalkomposition. Tübingen: Niemeyer.
Ford, Alan and Rajendra Singh. 1996a. 'Reply to Mohanan and Janda'. Trubetzkoy's Orphan, ed. by Rajendra Singh, pp. 166-70. Amsterdam: Benjamins.
Ford, Alan and Rajendra Singh. 1996b. 'Quelques avantages d'une linguistique débarrassée de la morpho(pho)nologie'. Trubetzkoy's Orphan, ed. by Rajendra Singh, pp. 119-39. Amsterdam: Benjamins.
Ford, Alan, Rajendra Singh and Gita Martohardjono. 1997. Pace Pāṇini: Towards a word-based Theory of Morphology. New York: Peter Lang.
Iyer, K.A. Subramania. 1977. The Vakyapadiya of Bhartṛhari. Delhi: Motilal Banarsidass.
Jain, Ramesh C. 1964. Hindi Samas Rachana ka Adhyayan. Agra: Vinod Pustak Mandir.
Jespersen, Otto. 1949. A Modern English Grammar. Kobenhavn: Ejnar Munksgaard.
Katre, Sumitra M. 1987. Aṣṭādhyāyī of Pāṇini. Delhi: Motilal Banarsidass.
Kielhorn, F. 1885. The vyākaraṇa-mahābhāṣya of Patanjali. Bombay: Directorate of Language.
Levi, Judith. 1977. The Syntax and Semantics of Complex Nominals. New York: Academic Press.
Lieber, Rochelle. 1981. On the Organization of the Lexicon. Bloomington: Indiana University Linguistic Club.
Lieber, Rochelle. 1992. Deconstructing Morphology. Chicago: The University of Chicago Press.
Quine, Willard van Orman. 1960. Word and Object. Cambridge, Mass.: Technology Press of MIT.
Selkirk, Elizabeth. 1982. The Syntax of Words. Cambridge, Mass.: MIT Press.
Singh, R. 1998. 'India's immanent critique of Pāṇini and its descriptive implications'. Plenary lecture, SALA, July 18. York, England: York University.
Spencer, Andrew. 1991. Morphological Theory. Oxford: Basil Blackwell.
Starosta, Stanley. 1988. The Case for Lexicase. London: Pinter Publishers.
Starosta, Stanley, Koenraad Kuiper, Zhiqian Wu, and Siew Ai Ng. 1997. 'On defining the Chinese compound word: Headedness in Chinese compounding and Chinese VR compounds'. In New approaches to Chinese word formation, ed. by J. Packard, pp. 347-70. Berlin: Mouton de Gruyter.
Vijaykrishnan, K.G. 1994. 'Compound typology in Tamil'. In Word Order in South Asian Languages, ed. by Miriam Butt et al., pp. 263-78. Stanford: CSLI.
Williams, Edwin. 1981. 'On the notions "lexically related" and "head of a word"'. Linguistic Inquiry. 12: 245-74.
90 ! Stanley Starosta et al.
5
On Defining the Chinese Compound Word: Headedness in Chinese Compounding and Chinese VR Compounds¹
Stanley Starosta, Koenraad Kuiper, Siew Ai Ng and Zhiqian Wu
Introduction
The question of what constitutes a compound word in Chinese languages can be approached in two ways. The first is to adopt the traditional Chinese philological definition, which supposes a compound word (in the great majority of cases) to be a word made up of two characters, just as the Chinese philological tradition supposes that an idiom consists of four characters. The second is to take the hypothesis of contemporary linguistics, which is to suppose that a compound word is a word consisting of two or more words. We take it that Chinese languages are natural languages subject to the same constraints as other natural languages, and to be analyzed in terms of the same procedures and terminology used by linguists in the analysis of other languages. It is our contention that although the traditional Chinese definition of a compound word is used by many scholars working on Chinese word-formation, the definition makes it difficult or impossible in some cases to give significant explanations of the facts of Chinese word-formation because the definition obscures the distinction between various kinds of morphemes on the one hand and words on the other. We shall illustrate this contention by looking at two phenomena: headedness in Chinese compounds and so-called VR or resultative compounds in Mandarin. In both cases we will compare analyses which take the traditional Chinese view of compounds and views
which utilize contemporary linguistic theories to see which definition of compounds provides the more compelling explanation of the phenomena. On the basis of our findings we will suggest that in the study of Chinese word-formation care needs to be taken with the extent to which linguists rely on traditional analyses.
Why do this? In physics there are a number of abstract terms which have been developed over the last couple of millennia to account for the nature of the physical universe. They include, in the field of mechanics, terms such as 'mass', 'velocity' and 'acceleration'. The definition of such terms has been a matter of evolution within physics and the terms do not now mean what they did even 100 years ago. Their evolution has been the result of evolving theories about the physical universe and, no doubt, when physical theories again change, the terminology will change. This happens when better explanations result from newer theories. It is, therefore, necessary in linguistics no less than in physics to look constantly at terminology (and the theories which give rise to it) to assess whether that terminology arises from theories which provide the best explanation of linguistic phenomena. On a realist interpretation of scientific theory, each change in the meaning of the theoretical constructs of a science genuinely changes what is known about the phenomena themselves.
It is our contention that traditional Chinese terminology in the field of word-formation is militating against better, that is more universal, explanations of the phenomena of Chinese word-formation. It also makes for confusion in the way linguists communicate their findings to each other. Western linguists may believe that linguists who are working on Chinese languages are using the terms 'word', 'compound word' and 'morpheme' with the same meaning that they have in western linguistics. But if this is not so, significant misunderstandings are likely to result.
Of course people are at liberty to define terms in any way they wish. The advantage of using traditional Chinese definitions of the linguistic properties of Chinese is that Chinese languages have the capacity thereby to appear to be very different from other languages. For example, it is often said that Chinese is unusual in that a single word is in many ways ambiguous, with one basic sense but many subsidiary ones. This is not surprising. A character is a symbol in a writing system which has evolved over a very long period. There are a large number of characters and in the process of memorizing them, students are taught mnemonics to recall them, mnemonics which
depend on the recognition of particular patterns and the association of each of these with one meaning which is then taken to be fundamental. Viewed from a western linguistic vantage point, characters may be ambiguous in many ways, both syntactically and semantically, because they can represent more than one word, i.e., they are potentially homographs. For a western linguist this is no surprise since homographs are relatively common even in languages with a phoneme-based writing system. Given the impossibility of learning to write in a language where every sense of every word in the language is represented by a unique ideograph, homographs must be very common in languages which use ideographic writing systems. The disadvantage of the Chinese view is therefore that it has the potential to obscure the properties of Chinese languages.
What is a Compound Word?
Structuralist theories in western linguistics do not always give clear answers to the question of what constitutes a compound word. Bloomfield (1933: 227) states that 'compound words have two (or more) free forms among their immediate constituents'. But Bloomfield also states that, 'The gradations between word and phrase may be many: often enough no rigid distinction can be maintained' (ibid.). The reason for this is that both compound words and phrases are concatenations of free forms. However, compound words 'exhibit some feature which, in their language, characterizes single words in contradistinction to phrases' (ibid.). Such features include semantic non-compositionality (although Bloomfield points out that this property is also shared with idiomatic phrases), a stress pattern characteristic of words rather than phrases, sandhi characteristic of words rather than phrases, fixed word order, and grammatical features of selection. None of these, unfortunately, provides totally clear ways to distinguish compound words from phrases. But for our purposes it will be sufficient to suppose that compounds consist of two or more free forms, i.e., constituents which are themselves able to function as words and therefore have syntactic categories. The resulting pairing must itself also be a word.
For comparison let us look in more detail at how a scholar influenced by the traditions of Chinese philological scholarship deals
with the question of what is a compound. According to Chao, 'A morpheme which can be uttered alone is free (F), and one which always occurs without pause with another morpheme in an utterance is bound (B). Therefore li [li2] "pear" is free.... Therefore taur [tao2 "peach"] is bound' (Chao 1968: 143-44).² The question then becomes, is the constituent of a word a form like li2 'pear' or a form like tao2 'peach'? Only if it is a li2 form, that is, a word, can we raise the question as to its syntactic category because syntactic word classes are established on the basis of syntactic distribution. A form which appears only inside a word has no syntactic distribution or function, and therefore has no syntactic category. (This is not, of course, to say that bound forms such as suffixes cannot donate syntactic categories to the words of which they are constituents, as in the case of the English suffix -ation which donates to the words on which it is the last suffix, the category noun.)
The difficulty in deciding whether a constituent of a word is itself a word is exacerbated by the traditional Chinese definition of 'word'. It is well-known that there is no good Chinese translation for the English word word. About the closest we can come is ci2 'An expression. Words; phrases; a part of speech. Tales; stories. A form of poetry' (Mathews 1960: 1031, no. 6971), and zi4 'A letter; a written character; a word' (Mathews 1960: 1025, no. 6942).
In fact, Chinese characters almost always encode monosyllabic morphemes, not words (DeFrancis 1989), but in practice, those who are influenced by traditional Chinese scholarly practice often regard a word as whatever corresponds to a single written character,³ and rarely raise the question of whether it is free or bound.⁴ As Charles Hockett puts it:
The early western students of Chinese, and the Chinese themselves until quite recently, perceiving the language through a haze of characters, saw utterances as rows of bricks, of uniform size and shape, each a single syllable and a single 'word', immutable, subject to no influence (or almost none) from the preceding and following bricks. (Hockett 1950: 70)
The problem with this assumption from a linguistic point of view is that while Mandarin Chinese does have a number of true monosyllabic words, the stretch of speech corresponding to a written character is very often not a minimum free form. As Chao notes (1968: 145-46), the linguistic unit corresponding to a character is typically
a morpheme bound to a preceding or following morpheme: '(3) (-)B(-): Start-Free or End-Free, but Not Both. The great majority of morphemes entered in a dictionary of single characters belong to this category. All numerals are of this type, ... most measures, ... , the cardinal directions, ... , the seasons of the year, ... , monosyllabic names.' But notwithstanding this realization that many morphemes are bound, Chao allows for compound words to consist of two bound stems.
Root words. All these (-)B(-) forms are bound forms, since they are bound at least at one end. It is useful to distinguish here between those bound forms which will become free by the addition of an affix and those which occur only in combination with other root morphemes. The first kind consists of roots to form primary derived words or root words. Examples are: yiitz [yi3zi] 'chair', wahtz [wa4zi] 'sock, stocking'.... The other is a much larger class consisting of bound morphemes occurring in compounds. Examples are nan [nan2] 'man, male', neu [nü3] 'woman, female', lih [li4] 'strength', liueh [lüe4] 'outline, approximate', yau [yao2] 'rumour', jye [jie2] 'to knot, to conclude'. (Chao 1968: 145-46)
Li and Thompson (1981) adopt a very similar view of compounding. They say that
there is a great deal of disagreement over the definition of compound. The reason is that, no matter what criteria one picks, there is no clear demarcation between compounds and non-compounds. (ibid.: 45)
They, however, adopt the traditional Chinese definition of compounds:
we may consider as compounds all polysyllabic units that have certain properties of single words and that can be analyzed into two or more meaningful elements, or morphemes, even if these morphemes cannot occur independently [i.e., as words] in modern Mandarin. (ibid.: 46)
There is an alternative to adopting traditional Chinese practice and that is to allow the definition of terminology to arise from theory.
Thus, what constitutes a word or a compound becomes the outcome of theories about the nature of words and compounds. We will follow this methodological procedure.
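The working definition adopted in this section (a compound has two or more immediate constituents, each of which is a free form) can be put as a small check. What follows is only a minimal sketch under stated assumptions: the set FREE_FORMS and the function names are hypothetical, the free or bound status of li2 and tao2 simply follows Chao's remark quoted above, and the sketch does not test the further requirement that the resulting pairing itself be a word.

```python
# Illustrative toy lexicon of free forms (forms that can stand alone as words).
# li2 'pear' is treated as free and tao2 'peach' as bound, following the
# passage from Chao (1968) quoted in the text. The set is hypothetical.
FREE_FORMS = {"li2", "blue", "bird", "berry"}


def is_free(form):
    """A form counts as free if it can occur on its own as a word."""
    return form in FREE_FORMS


def is_compound(constituents):
    """True if there are two or more immediate constituents, all of them free."""
    return len(constituents) >= 2 and all(is_free(c) for c in constituents)


print(is_free("li2"), is_free("tao2"))
print(is_compound(["blue", "bird"]))   # both constituents are free forms
print(is_compound(["cran", "berry"]))  # 'cran' is not a possible word
```

On this definition a form containing even one bound constituent is a word but not a compound, which is the distinction the traditional character-based definition does not draw.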
Headedness in Chinese Compounds
Many generative theories of compounding (Di Sciullo and Williams 1987, Kuiper 1972, Lieber 1992 and Williams 1981a) propose that words, including compound words, have heads. The head of a word has a number of attributes; the only one which will be of significance for what follows is that heads determine the syntactic category features of the word as a whole. For example, in the compound noun bluebird, it is the fact that bird is a noun which is responsible for the fact that bluebird is also a noun. There is a small class of exceptions, the so-called exocentric compounds such as redcap, where neither constituent is responsible for heading the compound. There are other odd cases such as the word outcome which do not seem to be headed. However, in general, words are headed and they are headed in only one way. In English, it is claimed, for example by Lieber (1981) and Williams (1981a), that all the compounds which are headed are right-headed, that is, the right-hand constituent determines the syntactic category of the compound. Simple compounds are binary compounds consisting of two words (Kuiper 1972).⁵ The structure of English simple compounds is as follows (Selkirk 1982: 14-15):
Nouns:
  NN  millwheel, firetruck
  AN  high school, poor house
  PN  uprising, afterbirth
  VN  scrubwoman, pickpocket
Adjectives:
  NA  heartbroken, colour-blind
  AA  icy cold, deaf-mute
  PA  above-mentioned, under-ripe
  VA  diehard
Verbs:
  NV  hand-made, spoon-feed
  AV  double-coat, sweet-talk
  PV  overdo, outlive
  VV  freeze-dry, drop-kick
Prepositions:
  PP  into, onto
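The right-hand head claim can be checked mechanically against the category patterns listed above. The following is a minimal sketch, not part of the original analysis: the dictionary PATTERNS and the function names are hypothetical, and the check looks only at category labels, so it says nothing about exocentric items such as redcap.

```python
# Category patterns for English simple compounds as listed above
# (after Selkirk 1982): category of the whole -> constituent patterns.
PATTERNS = {
    "N": ["NN", "AN", "PN", "VN"],
    "A": ["NA", "AA", "PA", "VA"],
    "V": ["NV", "AV", "PV", "VV"],
    "P": ["PP"],
}


def head_category(pattern):
    """Right-hand head rule: the last constituent supplies the category."""
    return pattern[-1]


def all_right_headed(patterns):
    """True if every pattern's right-hand member matches the whole's category."""
    return all(
        head_category(pattern) == whole
        for whole, pattern_list in patterns.items()
        for pattern in pattern_list
    )


print(all_right_headed(PATTERNS))  # True
```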
These examples clearly show the right-headed property of English simple compounds. Complex compounds, compounds which have other compounds as constituents such as a novel creation like riceflour bagel, are also right-headed in that the whole compound is a noun because bagel which is a noun is its head and, in turn, riceflour is a noun because flour which is its head is also a noun. They also show that English compounds have a binary structure. In the general case if compounds in all languages are binary, it is possible for them to be either left-headed or right-headed. In the case of English, they appear to be, for the most part, right-headed. We might then hypothesize that compounds in all languages have binary structure, that they are headed and that they are uniformly either left-headed or right-headed. This then becomes a set of hypotheses about universal grammar, the grammar which all languages have in common.
It is conceivable that Chinese compounds are unheaded or, if headed, that their headedness is not uniform. This is worse than a null hypothesis because it predicts that Chinese is fundamentally unlike other natural languages. If we examine those words which would be termed compounds in traditional Chinese scholarship then, in the case of simple compounds, i.e., of those compounds consisting of two and no more characters, it does appear that Chinese compounds are not headed. Their categorial composition seems for the most part random even using traditional Chinese syntactic categories.
Dominating category   Constituents
  N   NN, AN, VN, nN, MN, NA, AA, VA, HA, nA, NV, AV, VV, HV, nV, nn, NM, MM
  A   NN, AN, VN, HN, NA, AA, VA, HA, nA, NV, AV, VV, HV
  V   NN, AN, VN, AA, VA, HA, NV, AV, VV, HV, nV, NH, Vn, LL, KK
  H   AN, VN, HN, DN, nN, NA, AA, VA, HA, JA, DA, NV, AV, VV, HV, DV, JV, NH, AH, VH, HH, nH, DH, HD, nn, LL, AJ, HJ
  n   nn
  L   NL, DL, FPL, LL
  M   NM, AM, nM, DM
  K   NK, AK, KK, KN
  J   NJ, AJ, VJ, HJ, nJ, JJ, DJ, VA, HA, NV, AV, VV, HV, DV, JV, JH, DH, HH, ND
Abbreviations of categorial features which have been used in the structural descriptions of Chinese compounds are as follows (following Chao 1968): N noun, A adjective, H adverb, V verb, J conjunction, L localizer, M measure, K preposition, n numeral, D determiner, FPL free place localizer, P particle.
But notice that not all categories appear wholly random. Words containing n, L and M as an immediate constituent appear to be right-headed. So it is only the major lexical categories of noun, verb, adjective, and adverb which appear to be wholly unheaded while only conjunction among the other syntactic categories appears to be unheaded. If we look at complex compounds the situation becomes clearer. Using Chao's (1968) classification we find the following complex compounds:
1. Three-morpheme compounds
This is the most productive class of complex compounds. Chao (1968: 481) divides this group into eight sub-types:
(i) [N[NAN][N]] xiao3 shu4 dian3 'small number point, i.e., decimal point'
(ii) [N[A][NNN]] xian2 ya1 dan4 'salty duck egg, i.e., salted duck egg'
(iii) [N[VVN][N]] lu4 yin1 ji1 'record-sound machine, i.e., tape recorder'
(iv) [N[NNN][N]] fan1 bu4 xie2 'sail cloth shoes, i.e., canvas shoes'
(v) [N[VVV][N]] jiang4 luo4 san3 'descending-dropping umbrella, i.e., parachute'
(vi) [N[AAA][N]] suan1 la4 tang1 'sour hot soup, i.e., soup with vinegar and pepper'
(vii) [N[VVA][N]] fang4 da4 jing4 'let large optical instrument, i.e., magnifying glass'
(viii) Other compounds
[N[V][NNN]] chao2 niu2 rou4 'stir fried beef'
[N[V][NAN]] ban4 huang2 gua1 'mixed yellow melon, i.e., cucumber salad'
2. Four-morpheme compounds
(i) The greatest number of four-morpheme compounds are of the type 2+2.
[N[NAN][NAN]] bai3 huo4 gong1 si1 'department store'
[N[AAA][NNN]] gong1 gong4 shi4 ye4 'public career'
(ii) Of the 3+1 types, the 3 is more likely to be 2+1 than 1+2
[N[N[NnN][N]][N]] jiu3 long2 shan1 ren2 'Nine-dragon Mountain Man, i.e., pen name of a painter'
[N[N[A][NnN]][N]] hong2 shi2 zi4 hui4 'the Red Cross'
(iii) Type 1+3 is rare except with certain listable versatile morphemes, often called prefixes, occurring in titles, terms of address, and so on.
[N[A][N[VVV][N]]] fu4 yan2 jiu1 yuan2 'assistant research fellow'
3. Longer Compounds
(i) Five-morpheme compounds
[N[N[AAA][NNN]][N]] gong1 gong4 qi4 che1 zhan4 'bus station, bus stop'
(ii) Six-morpheme compounds
[N[N[NAN][N]][N[A][NNN]]] zhong1 guo2 yu3 xin1 zi4 dian3 'Chinese Language New Dictionary'
4. Telescoped Compounds
[N[NAN][N[N[NAN][NNN]][N[NVN][NAN]]]] zhong1 guo2 cheng2 tao4 she4 bei4 chu1 kou3 gong1 si1 'China National Complete Plant Export Corporation'
These are all right-headed. It would appear then, that where Chinese compounds are headed, they are right-headed. So what is to be done with all of the apparently unheaded and left-headed simple-compound cases? Let us look again at the traditional definition of a compound. First, many characters in Chinese are syntactically ambiguous as we said earlier. If Chinese compounds are right-headed then in the case of nouns, verbs, adjectives, and adverbs, it may be that the right-hand constituent is itself ambiguous thus making it impossible to say which of its syntactic categories is responsible for the category of the whole compound word being what it is. Second, it may be the case that not all two-character words
are compound words. One or both of the characters may represent bound stems, as Li and Thompson (1981) suggest. In the first case this would make the word like the English word cranberry, where the first morpheme is not a possible word. The second would make the word like the English word conceive in which neither of the two syllables is a possible English word but where we might still wish to say that the word consisted of two morphemes. In other words, it may be that many of the words which are not right-headed are also not compound words.
There are further possibilities. The number of available Chinese characters to represent words is large but finite. It is as likely in Chinese as it is in English that homonyms develop over time so that the same character can do service for more than one morpheme or word. For example, one of the two characters may be a bound stem which is homonymous with a free morpheme represented by the same character. For example, the English adjectives which end in the suffix -able such as comfortable would, if they were written in characters, have the suffix written with the same character as the independently existing adjective able. However, there is a clear synchronic distinction in current English between the suffix and the independently occurring adjective, whatever their historical relationship may have been. More radically, some words represented by two characters may be homonyms of other words represented by the same two characters but the word may be a single morpheme. A partly parallel case in English is the word window which was, 1,000 years ago, a compound word consisting of the word wind and the word eye. If it were written in characters these two elements would still be apparent in the written form of the word whereas to current English speakers even the wind part of window appears to be only accidentally related to wind as an independent word.
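The labelled bracketings given earlier for Chao's complex compounds lend themselves to a mechanical check of the claim that the whole compound is right-headed. The sketch below is an illustration only, and it rests on an assumption about the notation that the text does not state explicitly: the first symbol after an opening bracket is taken to be the category of that constituent, each following bare letter is a one-morpheme daughter, and each nested bracket is a complex daughter. The function names are hypothetical, and only well-formed strings with one-letter category symbols are handled.

```python
def parse(s):
    """Parse a labelled bracketing such as '[N[NAN][N]]' into (label, children).

    Assumption (ours): the first symbol after '[' is the constituent's own
    category; bare letters after it are single-morpheme daughters.
    """
    pos = 0

    def node():
        nonlocal pos
        if pos >= len(s) or s[pos] != "[":
            raise ValueError("expected '[' at position %d" % pos)
        pos += 1
        label = s[pos]
        pos += 1
        children = []
        while s[pos] != "]":
            if s[pos] == "[":
                children.append(node())        # complex daughter
            else:
                children.append((s[pos], []))  # one-morpheme daughter
                pos += 1
        pos += 1  # consume the closing ']'
        return (label, children)

    tree = node()
    if pos != len(s):
        raise ValueError("trailing material after position %d" % pos)
    return tree


def top_level_right_headed(tree):
    """True if the last immediate constituent has the category of the whole."""
    label, children = tree
    return bool(children) and children[-1][0] == label


EXAMPLES = [
    "[N[NAN][N]]",        # xiao3 shu4 dian3 'decimal point'
    "[N[A][NNN]]",        # xian2 ya1 dan4 'salted duck egg'
    "[N[VVN][N]]",        # lu4 yin1 ji1 'tape recorder'
    "[N[A][N[VVV][N]]]",  # fu4 yan2 jiu1 yuan2 'assistant research fellow'
]

for bracketing in EXAMPLES:
    print(bracketing, top_level_right_headed(parse(bracketing)))
```

The check is deliberately limited to the outermost constituent, since that is the level at which the claim about these complex compounds is made.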
Let us therefore suppose that all or at least the great majority of compounds in Chinese are right-headed; how then are the individual non-headed or not right-headed words to be analyzed? Since the mnemonics used to teach characters tend to present a single underlying sense as the basic sense of each character, however implausible the semantic route to the homonymous senses represented by the same character may be, it is likely to be difficult for scholars influenced by traditional Chinese scholarship to see homonymy where it exists.
A conventional dictionary is useless for this purpose because Chinese dictionaries are dictionaries of characters, and the glosses they give for characters do not distinguish glosses for the character when it is used alone (if it ever is), glosses obtained by subtracting the meaning of its fellow morpheme from words in which it occurs, and glosses for the word as it was used in classical Chinese, hundreds or thousands of years ago. In that sense, Chinese dictionaries are etymological morpheme dictionaries that do not distinguish the modern morpheme from its etymon.⁶
Because of the importance of orthography in Chinese education, it is also a frustrating task to attempt to determine whether a form is free or bound by asking someone who has been socialized in such a system. In our experience, such a person will normally claim that almost any given character can be used as a free form, frequently citing Classical Chinese forms to prove his point. The only published Chinese dictionary we know of which seems to show any understanding of the difference between free and bound forms and between synchronic and diachronic stages of a language is the Dictionary of Spoken Chinese published by the US War Department in 1945.
What a linguist working in Chinese needs is the linguistic native-speaker intuitions of a Chinese speaker but of one who is also illiterate in Chinese. Failing the presence of such a person we may examine other areas of Chinese morphology to see if our hypothesis, that not all of what are claimed to be compounds are in fact compounds, can find further support.
VR Forms
Mandarin VRs ('result compounds') (Gebauer 1980, Huang 1984, Lu 1977 and Thompson 1973) give further evidence that traditional analyses and modern ones based on them may be providing incorrect analyses of Chinese word-formation. In analyzing VR forms, three questions need to be answered. First, is a VR form a word? Second, what lexical class does it belong to? Third, what kinds of elements is it composed of? There is general agreement that VR forms are themselves words. VR forms are free in the sense that they are not limited in the choice of elements which can precede or follow them, and they are minimal in the sense that
they cannot be further decomposed into parts which are minimal and free. VRs also satisfy further Bloomfieldian criteria in having distributions like other morphologically simple words, and in being indivisible (Bloomfield 1933: 180-84).⁷ There is also general agreement regarding de2 and bu4, elements which appear to be exceptions to the indivisibility criterion since they can appear between the V and the R. The consensus, going back at least as far as Chao, is that these elements are infixes (Chao 1968: 159, Her 1990: 200, Lin 1990: 1, cf. Thompson 1973: 364).⁸ With regard to the second question, there seems to be no disagreement about the fact that a VR form is a verb, and in terms of its external distribution a verb occurring in the same kinds of environments as morphologically simple verbs, including, for example, the ability to be negated by mei2 and marked as perfect by le. There is also general agreement on the final question, namely, that the first element of a VR form is a V.
The remaining question is that of the nature of the R element of a VR form. If the R is a verb, as assumed by traditionally-based analyses, then a VR form is composed of two free words and should be regarded as a compound. We will show that, in a number of cases, the R is not a verb, and not a word. That being so, a VR form of this type is not composed of two free forms, and is thus not a compound.⁹ To do this we will provide the same kind of analysis as in the previous section, contrasting traditionally-based analyses of VR, particularly those of Lin (1990), with one which does not make the traditional assumption that the R element is a verb but instead assumes that it is a derivational suffix. Lin (1990) is a careful, thorough, and competent Lexical Functional Grammar analysis of VR forms. However, it adopts the traditional Chinese position on what constitutes a compound, as the following quotations illustrate.
Among the traditionally categorized classes of compounds in Mandarin, there is a group of compounds often referred to as the Verb-Complement (V-R) compounds which is structurally [V1V2]V in general. Semantically, both concatenated members of the compounds have predicative functions; and, their meanings are generally that the second verb describes the state of the subject, the object, or the event as a result of the action or process described by the first verb. (Lin 1990: 1)
Lin assumes without question that VR forms must be compounds composed of two verbs and her study explains how the properties of the VR verb can be derived from the predicate-argument structures of its two constituent verbs. However, if the R is not a verb but rather a suffix, then there is no second verb and no second predicate-argument structure to work from. Therefore the properties of VRs cannot be a function of the predicate-argument structures of the second verb. Lin's assumptions are shared by others, for example, One-Soon Her:
This semantic classification may also be relevant to the description of the morpholexical process of resultative compounding, where an action verb, [ACTIVE + PROCESS -], which may be either transitive or intransitive, is joined by an [ACTIVE -] verb, i.e., either a state verb or a process verb, to form an action-process verb, [ACTIVE + PROCESS +]. (Her 1990: 125-26)
Once again, the author never questions the assumption that the R element is a verb and that a VR form is a compound. So, what evidence is there that the R is in fact a suffix and not a verb or even a bound verb stem? The evidence comes from problems and anomalies that appear when the assumption that it is a verb is pursued in a generative framework such as Lexical Functional Grammar. In particular, even though Lin, for example, defines VR compounds as 'structurally [V1-V2]V' (Lin 1990: 1), in many cases the second element of the 'compound' is either: (i) different in meaning from the homophonous/homographic verb occurring alone, or (ii) not a verb at all synchronically, or (iii) not even an identifiable word synchronically (cf. Thompson 1973: 363). Consequently the forms are not compounds by any standard linguistic definition, and analyses which attempt to derive them by the conflation of two free forms are untenable. To cite some specific examples: Lin derives yao3zhu4 'bite and hold' (Lin 1990: 84, 140) from two input verbs, yao3 'bite' and zhu4 'reside'.
However, it takes a lot of semantic ingenuity to see any connection between the meaning of the verb 'reside' and the meaning of 'persisting resulting state' that is shared by verbs suffixed by -zhu4.¹⁰ In this situation, we might take refuge in the fact that compounds are typically not semantically compositional (cf. Chao 1968: 278), and state that the -zhu4 of yao3zhu4 really is the verb 'reside',
but that the compound has undergone a semantic shift. The contrary view, that -zhu4 is a suffix in resultative verbs, is supported by the fact that its meaning in these verbs is perfectly regular; the R suffix -zhu4 derives VR verbs with the meaning 'persisting state resulting from the action V' from their verb stems, and speakers derive perfectly regular new forms by this same pattern. Thus the compounding analysis may (or may not) preserve etymological information, but the suffixation analysis captures information about the speaker's synchronic competence, the regularity of the sense of the suffix, and its productivity.
If this is a relatively simple conclusion to reach, why have linguists working with Chinese classifications of word formation not drawn it? Mainly, we believe, because of preconceptions about the nature of Chinese word-formation. In the cases where the suffixal nature of a form is so obvious that it becomes very difficult to maintain the equation of a character with a word or bound stem, suffixes are allowed, but only with a new definition that limits the range of application of this term. For example, Chao limits suffixes in Chinese as follows: 'A suffix in Chinese is an empty morpheme, mostly in the neutral tone, which occurs at the end of a word and characterizes its grammatical function' (Chao 1968: 219). Why should a suffix be defined as 'semantically empty'? Certainly it is not generally true that suffixes in other languages are 'semantically empty'. Similarly, the general definition of 'affix' does not require that affixes bear neutral tone or that they 'characterize grammatical function'.
If a suffix such as -zhu4 is linguistically distinct from the verb written with the same character, and never occurs as a free form, how is one to decide its properties as a word? One common method is to look up the meaning the character represented in the Chinese language of 1,000 years ago.
The other procedure is a kind of subtractive inference grounded in the equation, WORD = CHARACTER. The logic goes as follows: In a VR form, V and R are both words, since they are written with characters. Some of the Rs correspond to free verbs (or adverbs), so all Rs are verbs (or adverbs). The meaning of the posited R verb is derived by subtracting the meaning of V from the overall meaning of VR, and the predicate-argument structure of this hypothetical verb is then extrapolated from the meaning so derived. This is, for example, how Lin arrives at the meaning of R elements and at their contribution to the meaning of the whole VR form.
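The logic of this subtractive procedure can be made concrete with a toy model. The sketch below is purely illustrative: it assumes, hypothetically, that meanings can be represented as sets of feature labels, and the labels and function names are invented for the example rather than taken from Lin’s analysis. Under those assumptions, a meaning for R obtained by subtraction necessarily recombines with V to give back the VR meaning it was extracted from.

```python
# Toy model (an assumption for illustration only): a meaning is a set of
# feature labels. Nothing here is drawn from Lin (1990).

def posit_r_meaning(vr_meaning, v_meaning):
    """Subtractive step: whatever is left of VR once V is removed is assigned to R."""
    return vr_meaning - v_meaning

def predict_vr_meaning(v_meaning, r_meaning):
    """Additive step: 'predict' the meaning of VR by recombining V and R."""
    return v_meaning | r_meaning

# Hypothetical feature labels for a V and for a VR form in -zhu4.
v = frozenset({"action"})
vr = frozenset({"action", "persisting-result"})

r = posit_r_meaning(vr, v)
prediction_matches = predict_vr_meaning(v, r) == vr

print(sorted(r), prediction_matches)
```

Because `r` is defined from `vr`, the final comparison succeeds for any input in which the features of `v` are contained in those of `vr`, so the match carries no independent evidential weight.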
It seems to us that the predicative functions in V-R compounds must be established on the SEMANTIC properties of the two verbal items involved. Even the argument structures of V-R compounds in which both components are intransitive seem to be determined in this way. (Lin 1990: 46)
In the following sections, I will introduce the notion of lexical semantic defaults (Dowty 1988) and classify verbs into a small set of semantic classes according to the lexical semantic defaults. Then, based on the semantic classes, the predicative functions of the R members in V-R compounds will be predicted and the possible thematic structures of the compounds will be constructed. (Lin 1990: 80)
A study such as Lin’s, then, is circular, first performing the operation VR − V = R, and then commuting the operation, VR = V + R, to ‘predict’ the meaning of its constituents.
I will discuss the compounds with the following assumptions: (1) V-R compounding involves combinations of two lexical entries including the two thematic structures and the semantic information contained, (2) the first member of the compounds is HEAD and the second one COMPLEMENT, [1] and (3) the thematic structure of the HEAD member will unify with that of the COMPLEMENT member to produce a new thematic structure.
(1) THE MORPHOLOGICAL OPERATION: V1 → V1-V2; ...
(2) THE COMBINATION OF THEMATIC STRUCTURES: a. V1 + V2 → V1-V2 ... ÷ PAT
(Lin 1990: 80, 113–14)¹¹
Given the questionable ontological status of the R verbs identified in this procedure, it is not surprising that there are sometimes problems in setting them up, e.g.,
While it is a simple matter to determine the form class of a compound as a whole, the form classes of its constituents are not always clear or determinate and consequently sometimes their syntactical relation is unclear or ambiguous. This is a reservation we have
to make in all the headings and subheadings of the syntactic compounds detailed below. (Chao 1968: 366, 367)
Lin notes the difficulty in predicting the meanings of VRs from their components.
Based on past works, it is quite difficult to establish a rule-governed interpretation process for V-R compounds, because the subcategorization frames of the compounds are not formed by straightforwardly concatenating the frames of the V member and the R member. (Lin 1990: 1–2)
The [chou2]V1-[bai2]V2 [ ] ... in (20)a and the [chou2]V1-[si3]V2 [ ] ... in (20)b share the same V member; in another pairs, the [chou2]V1-[si3]V2 ... in (20)b and the [nan2]V1-[si3]V2 [ ] ... in (20)c share the same R member. The argument structures of these compounds are not decidable by the V member or the R member alone. (Lin 1990: 47)
This unpredictability is again partly due to the assumption that both components are verbs, so that it is hard to predict the meaning of nan2si3 [ ] ‘extremely difficult’ from nan2 ‘difficult’ and si3 ‘die’. However, when the second element is regarded as the separate derivational suffix -si3, which synchronically has nothing at all to do with dying and which adds the meaning ‘extremely’, the process turns out to be quite regular.
At several points Lin appears close to recognizing the fact assumed in the Dictionary of Spoken Chinese analysis that a single character may represent two quite different linguistic elements: a free verb with one meaning and a suffix, indicated by a preceding hyphen, with another, not necessarily related, meaning.¹² At least this seems to be an obvious interpretation of her tabulation on p. 9, for example (Lin 1990: 9, slightly modified, characters added):
Form      Free verb     Suffix
shang4    ‘ascend’      ‘-up’
xia4      ‘descend’     ‘-down’
jin4      ‘enter’       ‘-in’
chu1      ‘exit’        ‘-out’
qi3       ‘rise’        ‘-up’
hui2      ‘return’      ‘-back’
guo4      ‘pass’        ‘-over’
kai1      ‘open’        ‘-away, apart’
long3     ‘gather’      ‘-together’
Lin’s example following the table, zou3 ‘walk’, kai1 ‘open’, zou3kai1 ‘walk away’, is an excellent illustration of our point. zou3kai1 has nothing to do with opening anything, but all VRs ending in the suffix -kai1 seem to refer regularly to a separation between objects resulting from the action of the V element of the VR form.¹³ Admittedly, not all of Lin’s examples are of this type. Thus VRs with -jin4 do probably share a meaning of entering something. However, a unified analysis attempting to capture the synchronic semantic regularities would have to treat all of these forms as suffixes, resorting to etymology to account for any semantic relation between the R and a corresponding free verb which is written with the same character.
Note that this is not to claim that the meanings of VR words derived by suffixation are 100 per cent predictable, since derivation, like compounding, is typically not fully productive or compositional. The claim is rather that not formally tying the meaning of suffixes to their etymological sources results in a far higher degree of generalization than is gained in more traditionally based analyses.
In support of such an analysis, the following examples show how it might be carried further. The verbs and suffixes in this list, each pair written with the same character in Chinese orthography, must be distinguished according to linguistic criteria. They are, in our view, homonyms. Glosses for the meanings as verbs and as suffixes are based partly on the Dictionary of Spoken Chinese.
Form      Verb gloss             Derivational suffix gloss
cha2      ‘look up’              ‘examine’
dao3      ‘topple’               ‘down, over’
dao4      ‘arrive at’            ‘to, until; succeed’
diao4     ‘fall, drop’           ‘off’
fu2       ‘be convinced’         ‘over’
guang1    ‘bare’                 ‘completed, exhausted’
hao3      ‘good; finished’       ‘finished’
ji2       ‘be urgent’            ‘extremely’
jin4      ‘enter’                ‘into; ahead; to the end’
zhu4      ‘reside’               ‘persisting result’¹⁴
kai1      ‘open’                 ‘apart, away’
liang2    –                      ‘better; improved’
liao3     (‘enough’)             ‘be possible’
ni4       ‘bother, bore’         ‘bored from doing’
qi3       ‘rise’                 ‘begin’
shang4    ‘ascend; go to’        ‘up; together’
si3       ‘die’                  ‘dead; extremely’
tou4      ‘air passes’           ‘thoroughly, completely’
tong1     ‘pass through’         ‘thorough effect’
wai1      ‘askew’                ‘askew, lopsided, awry’
ya3       ‘low, quiet, mute’     ‘hoarse, husky’
zou3      ‘walk, go’             ‘away, through, from’
Our derivational analysis is an extension of the one proposed by Thompson (1973), and thus might be thought to be open to Lin’s criticism of this approach: ‘Thompson’s approach leaves the correlation between the predicative functions of complement members and the behaviours of their corresponding compounds unexplained’ (Lin 1990: 37). However, in our view, the R element is not a complement and has no predicative function, so there is in fact nothing to explain; Lin’s ‘explanation’ of this supposed relation, as we have shown above, is simply circular. We claim, instead, that a derivational analysis, embedded in the framework of a universal theory and divorcing itself from the vagaries of etymology, is able to capture more and better generalizations about the relations between VRs and their components than one which resorts to etymology.
The word-and-paradigm analysis we assume again might seem to be open to another of Lin’s criticisms: ‘Thus, if treating them as the compound verbs without internal structure, then how to explain the relation between the syntactic behaviours of this composition type and the predicative functions of the complement members?’ (Lin 1990: 33). The answer to Lin’s rhetorical question is that Derivational Rules (DRs), analogical patterns of word-formation (Starosta 1988: 90–96), relate the semantic, syntactic, and phonological properties of source words to those of derived words directly, without ever having to refer to internal structure. As an example, the following DR would account for Lin’s zou3kai1 and other -kai1-suffixed forms:
DR-1
    [+V, −telc] : [+V, +telc, +sprt]
    ] : kāi]
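One way to picture DR-1 is as a mapping between whole lexical entries. The Python sketch below is a hypothetical illustration, not part of the lexicase formalism: the `Verb` class, its field names, and the toy entry for zou3 are assumptions made for the example. The function consults only the features of the source word and adds material at its right edge, and because its output is [+telc] it is not defined for its own output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verb:
    """A whole-word lexical entry (hypothetical, simplified)."""
    form: str                 # pinyin with tone numbers
    telic: bool               # [+telc] if True, [-telc] if False
    separation: bool = False  # [+sprt] if True


def dr1(source: Verb) -> Optional[Verb]:
    """Sketch of DR-1: [+V, -telc] : [+V, +telc, +sprt], with ] : kai1].

    Returns the corresponding derived verb, or None when the source
    does not match the left-hand side of the rule.
    """
    if source.telic:
        return None  # the rule relates only atelic (non-result) verbs
    return Verb(form=source.form + "kai1", telic=True, separation=True)


zou3 = Verb("zou3", telic=False)
zou3kai1 = dr1(zou3)

print(zou3kai1)
print(dr1(zou3kai1))  # the derived verb is telic, so no further output
```

The sketch treats the rule as a one-way function for brevity; an analogical pattern of this kind can equally be read from the derived form back to its source.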
This rule can be read as: ‘corresponding to any atelic (non-result) verb, there may be a telic verb identical to the source verb but ending in kāi and differing in meaning by the addition of a component of “separation”’. Since this is a lexical derivation rule, it has the conventional properties of derivation, allowing some semantic deviation and lexical gaps. Note that there is no internal structure referred to here at all, and none is needed. The rule is a statement of an analogical pattern of correspondence between two sets of verbs, a pattern which can be used in recognizing or creating new VR forms. Note too that this formalization automatically accounts for the nonrecursivity of the process, something that Lin’s approach does not seem to handle. The rule applies only to non-result verbs, but the forms it produces are result verbs, and thus not eligible to undergo the same rule.
Derivational Rules are established for a language by first independently classifying all verbs syntactically, without regard for whether they are morphologically simple or complex. The linguist then looks for subsets of words under one branch of the classification that have a fairly regular one-to-one syntactic, semantic, and phonological correspondence to one another, then writes rules that will associate those branches. The result for Chinese will be Derivational Rules such as DR-1. As in the case of Lexical Functional Grammar, these rules typically refer to argument roles, though the case relations of lexicase dependency grammar are grammatically based, unlike the thematic relations used in Lexical Functional Grammar and Government and Binding theories.
There are further advantages in the analysis which we propose, having to do with the argument linking properties of VR forms. In addition to Lin (1990: 109ff), Chang (1991: Chapter 2) and Li (1990: 178, 190) also claim that complex internal structures must be posited for VR forms.
These structures typically require head-feature percolation and may include, for example, multiple occurrences of the same theta-role which are marked as identical to each other (‘fusion’) or which do not get assigned externally at all (‘suppression’), or different theta-roles which get assigned to the same external actant (cf. Baker 1989). For example, Chang observes that: ‘complex verbs, on surface and in terms of grammatical functions, do not differ from simple verbs – most of them take a subject and an object. What makes them distinct from simple verbs is their complex internal thematic structures’ (Chang 1991: Chapter 2, Sec. 2.5). She
has no real explanation as to why her complex verbs fit into the same pre-existing classes that can be established for simplex verbs. Given the power of her eclectic descriptive system, there is certainly nothing that would formally require this result. In lieu of an explanation, she gives a list of language-specific ad hoc stipulations, which she refers to as ‘principles’, in order to reconcile her output to her observations.
Powerful and complex analyses such as the ones proposed by Lin, Chang, and Li are only valid until such time as an alternative analysis which accounts for the same patterns without assuming such power and complexity is produced. A lexicase analysis such as that of Ng (1992) shows that the same phenomena which Lin and Chang can only account for with complex internal structures fall out naturally from a constrained dependency analysis which posits only five case relations. If we compare, for example, Li’s (1990) treatment of thematic relations in VV compounds with our and Ng’s treatment of these as suffixed verbs, we will see that much of the additional machinery required to account for VV verbs (on the assumption that the R element is a V) is not necessary on the assumption that Rs are suffixes.
The assumption that the second element is a verb leads to analyses which are in breach of major constraints on linguistic theory such as the Theta Criterion, which requires that ‘Each argument bears one and only one theta role and each theta role is assigned to one and only one argument’ (Chomsky 1982: 36). Since it is not manifest that R elements are verbs, the breach of the Theta Criterion provides theory-internal support for the analysis of R as a suffix. Li (1990), for example, makes claims following from the R element being analyzed as a verb as follows: ‘A remarkable property of these compounds is that there seem to be more theta-roles available than overt NP arguments’ (Li 1990: 178).
However, this is only remarkable if the R element of VR forms is in fact a verb. Our analysis claims that it is not, so there are no extra arguments left over to account for. Naturally, if Li is correct then ‘The question is how the Theta Criterion is satisfied, which requires that each theta-role is assigned to an argument’ (Li 1990: 178). In our analysis, the number of nominal arguments required by a VR form is specified by the derivation rule itself. The number could be the same as the number of arguments allowed by the V-element of VR, or more or less than that number, but never more than five (Starosta 1988: 126).¹⁵ Each nominal argument is projected from one and only one of the maximum of five
possible theta-role slots available in the case frame of the derived VR verb, so no violation of the Theta Criterion is formally possible. Li’s problems with the Theta Criterion, which arise from supposing the R element to be a verb, lead to the introduction of a number of ad hoc ‘independently justified assumptions’ (Li 1990: 178). In our analysis, no such ad hoc assumptions are required. The analysis we propose is consistent with the strict constraints of the lexicase theory without any additional ad hoc stipulations.
Li’s assumptions include ‘three particular assumptions which will be used in the rest of the paper. Of the three, theta-identification is the mechanism for “reducing” the number of theta-roles a V-V compound actually assigns to its arguments. The other two, a structured theta-grid and the head-feature percolation, interact so as to correctly restrict the patterns in which the theta-roles of the two components of such a compound are identified and assigned’ (ibid.). By contrast, in a lexicase dependency grammar, the inventory of case relations (‘theta-roles’) is limited to five grammatically determined universal roles, and there are only two possible subject case roles, agent for transitive verbs and patient for intransitives (Starosta 1988: 181). Any new derived verb must universally conform to these requirements, so no language-specific stipulations are required to give this result. Li’s observations about the prominence of ‘Theme’ in object-incorporation verb compounding follow from the other axiomatic assumption of the lexicase case-relation system, Patient centrality: every verb takes a Patient in its case frame, and Patient is the central pivotal case relation, the one which is the scope for other complement-case relations (ibid.: 128). Thus, there is no need in our analysis for a ‘structured theta-grid’. Li also claims that: ...
the relevant features of the head will be maintained throughout its derivation projection is widely assumed in linguistic practice. Therefore, I will follow this convention by assuming that the head of a compound word determines the fundamental properties of the compound. In fact, I will assume a (probably) strong version, which requires that the theta-role prominency of the head must be strictly maintained in the theta-grid of the compound. (Li 1990: 181)
If, however, the R element of VR compounds is a suffix, then the V constituent is the only constituent which is a lexical item with its own case frame (‘theta-grid’), and the properties of the derived V-R form are a function of the properties of the V. There is no need therefore for a mechanism to subordinate the lexical properties of the R component, since it has no lexical properties of its own to suppress.
There is further support for supposing that R elements are suffixes, and it comes from the first section of this paper. If Chinese compounds are right-headed, then VR compounds constitute a class of counter-examples to this claim, since most scholars who suppose them to be compounds suppose that it is their left-hand V constituent which is the head of the compound. If, however, the R element is a suffix, then in many generative treatments such a suffix should be the head of the derived word of which it is the right-hand constituent, since headedness in word structure must be either uniformly left- or right-headed. This again provides theory-internal evidence that what are often termed, following traditional Chinese analyses, ‘VR compounds’ are not compounds but derived words.
Conclusion
We have shown that traditional Chinese definitions of compounds and words, and contemporary analyses which take those definitions for granted, can be a hindrance to gaining an understanding of Chinese compounds in general and of VR compounds in particular. We have also shown that applying standard linguistic criteria to these constructions results in a significantly different analysis, one which distinguishes between synchronic linguistic competence and etymology, and which conforms to a radically constrained grammatical framework, thus staking a credible claim to explanatory adequacy.
Notes
1. An earlier version of this paper was presented at the First International Conference on Chinese Linguistics, University of Singapore, in June 1992, and the final version has appeared in Jerome L. Packard (ed.), New Approaches to Chinese Word Formation: Morphology, Phonology and the Lexicon in Modern and Ancient Chinese. Trends in Linguistics. Studies and Monographs. Berlin: Mouton-de Gruyter, pp. 347–70. This version is to be reprinted in Rajendra Singh and Stanley Starosta (eds), Explorations in Seamless Morphology. New Delhi: Sage. We would like to thank Lillian Meei-chin Huang for careful reading and comments, Tsai-fa Cheng for helpful discussions and criticism in connection with our views on the impact of traditional analyses on current Chinese linguistics, Ke-sheng Wang for checking some of the pinyin orthography, and Keiko Machizuki’s Tokyo University of Foreign Studies students for pointing out an error in character choice.
2. All romanized forms cited or quoted in the paper have been converted to pinyin orthography, and tone numbers have been added.
3. Since a character represents a single syllable, this supports the commonly accepted characterization that Chinese is a monosyllabic language. For an informed and linguistically well-founded description of the Chinese writing system, see DeFrancis (1989).
4. There are a few recognized exceptions to this, such as the noun suffix -zi and the two syllables of such disyllabic forms as hu2die2 ‘butterfly’, xi1shuai4 ‘cricket’, and dong1xi ‘thing’, but the general assumption is that a character represents a word.
5. The distinction between simple and complex compounds is made by Kuiper (1972) on the ground that the structural properties of complex compounds in English are more constrained than those of simple compounds.
6. Regrettably this tradition continues in some modern dictionary projects. However, according to Chu-ren Huang (personal communication), the CKIP dictionary now being used and developed at the Academia Sinica in Taiwan is word-based rather than character-based. It distinguishes free from bound forms and has morphological rules for deriving suffixed forms in -xing and -du as well as determiner-measure compounds. Because of technical considerations, it is not consistent in providing distinct lexical entries for homophonous words, but they are given separate entries when they belong to different grammatical categories.
7. Note, however, Bloomfield’s caveat that ‘None of these criteria can be strictly applied: many forms lie on the border-line between bound forms and words, or between words and phrases’ (Bloomfield 1933: 180–84).
8. Chao has an apposite example from Cantonese showing that the rule is conditioned by the phonological shape of the word rather than by some particular kind of internal lexical structure (Chao 1968: 163). ‘In Cantonese the first syllable can be repeated alone, even though bound.... This is possible in Cantonese even if it does not make good sense, as Nee dhoo-mu-dhoongoh? “Are you hungry?”, where dhoongoh means literally “stomach [dhoo] hungry [ngoh]”.’
9. This definition has been loosened somewhat to accommodate cases in which compounds are composed of stem forms, words with their inflectional affixes missing. This situation could also conceivably arise in Chinese if Starosta is correct in his claim, which has so far gone uncontested, that Mandarin Chinese has case inflection (Starosta 1985). However, we will not further consider this possibility here.
10. The Dictionary of Spoken Chinese does cite one example which may represent the more direct etymological source of this verb: ‘jù [zhu4] (I) to stop, stay, live. Sometimes causatively: jù-dzwe:y-ba! [zhu4zui3ba] “Shut up!”’ However, none of our consultants has ever proposed such a relation, so it is doubtful whether there is any synchronic validity to this connection.
11. Similar argumentation is to be found in the VR analyses of Chang (1991), Li (1990) and Sproat and Shih (1993).
12. Most of these suffixes are glossed as English ‘particles’. This is quite appropriate, since the status of English ‘particles’ is very similar: they are typically formal markers of lexical derivation. The meaning they bear is rather vague but relatively consistent, just as is the case with VR derivational suffixes.
13. The TwinBridge on-line Chinese-English dictionary has the following sub-entry for kai1: ‘suf. away; off; out: zhèi wū tài xiǎo, wǒ men zuò bu ~ = The room is not big enough to seat all of us.’
14. ‘suf. tightly; firmly: zhuā ~ = grasp firmly: zhuā bu ~ = unable to grasp’ (TwinBridge English-Chinese Dictionary).
15. Williams (1981b) also proposes that suffixes can change the argument structure of their stems.
References
Anonymous. 1945. Dictionary of Spoken Chinese: Chinese-English, English-Chinese. War Department Technical Manual TM 30–933. Washington: United States War Department.
Baker, Mark C. 1989. ‘Object sharing and projection in serial verb constructions’. Linguistic Inquiry. 20: 513–53.
Bloomfield, Leonard. 1933. Language. New York: Henry Holt and Company.
Chang, Hsun-huei. 1991. ‘Interaction between syntax and morphology: A case study of Mandarin Chinese’. Ph.D. dissertation. Honolulu: University of Hawai‘i.
Chao, Yuen Ren. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.
Chomsky, Noam. 1982. Lectures on Government and Binding. Dordrecht: Foris.
DeFrancis, John. 1989. Visible Speech: The Diverse Oneness of Writing Systems. Honolulu: University of Hawai‘i Press.
Di Sciullo, Anna-Maria, and Edwin Williams. 1987. On the Definition of Word. Cambridge, Mass.: MIT Press.
Dowty, David. 1988. ‘Thematic proto-roles, subject selection, and lexical semantic defaults’. Paper presented at the 1987 colloquium. The Twenty-second Annual Meeting of the Linguistic Society of America, San Francisco.
Gebauer, Margaret See. 1980. ‘The lexical phrase – syntactic manoeuvering in the lexicon’. Cahiers Linguistique d’Ottawa. 9: 127–51.
Her, One-Soon. 1990. ‘Grammatical functions and verb subcategorization in Chinese’. Ph.D. dissertation. Honolulu: University of Hawai‘i.
Hockett, Charles. 1950. ‘Peiping morphophonemics’. Language. 26: 63–85.
Huang, C.-T. James. 1984. ‘Phrase structure, lexical integrity, and Chinese compounds’. Journal of the Chinese Language Teachers Association. 19: 53–78.
Kuiper, Koenraad. 1972. ‘Rules of English noun compounds: Implications for a theory of the lexicon’. Ph.D. dissertation. Simon Fraser University.
Li, Charles N. and Sandra A. Thompson. 1981. Mandarin Chinese: A Functional Reference Grammar. Berkeley: University of California Press.
Li, Yafei. 1990. ‘On V-V compounds in Chinese’. Natural Language and Linguistic Theory. 8: 177–207.
Lieber, Rochelle A. 1981. On the Organisation of the Lexicon. Bloomington, Indiana: Indiana University Linguistics Club.
Lieber, Rochelle A. 1992. Deconstructing Morphology. Chicago: University of Chicago Press.
Lin, Fu-wen. 1990. ‘The verb-complement (V-R) compounds in Mandarin’. M.A. thesis. Taiwan: Linguistic Institute, National Tsing Hua University, Hsinchu.
Lu, John H-T. 1977. ‘Resultative verb compounds vs directional verb compounds in Mandarin’. Journal of Chinese Linguistics. 5: 276–313.
Mathews, R.H. 1960. Mathews’ Chinese-English Dictionary. Cambridge, Mass.: Harvard University Press.
Ng, Siew Ai. 1992. ‘Verb subcategorization and derivation in Singapore Mandarin: A Dependency grammar analysis’. Ph.D. dissertation. Honolulu: University of Hawai‘i.
Selkirk, Elizabeth. 1982. The Syntax of Words. Cambridge, Mass.: MIT Press.
Sproat, Richard and Chilin Shih. 1993. ‘On the sources of some constraints in Mandarin morphology’. Paper presented at the Third International Symposium on Chinese Languages and Linguistics. July. Taiwan: National Tsing Hua University.
Starosta, Stanley. 1985. ‘Mandarin case marking: A localistic lexicase analysis’. Journal of Chinese Linguistics. 13: 215–66.
Starosta, Stanley. 1988. The Case for Lexicase. London: Pinter Publishers.
Thompson, Sandra A. 1973. ‘Resultative verb compounds in Mandarin Chinese: A case for lexical rules’. Language. 49: 361–79.
Williams, Edwin. 1981a. ‘On the notions lexically related and head of a word’. Linguistic Inquiry. 12: 245–74.
Williams, Edwin. 1981b. ‘Argument structure and morphology’. The Linguistic Review. 1: 81–114.
6 Do Compounds have Internal Structure? A Seamless Analysis¹
Stanley Starosta
History
Traditional grammars of classical European languages described word shapes in terms of PARADIGMS and PRINCIPAL PARTS, an approach often referred to in modern linguistics as WORD AND PARADIGM MORPHOLOGY, or ‘WP’ (cf. Hockett 1954: 386, Matthews 1974: 136–37 and Robins 1959: 119, 144). American ‘structural linguistics’ innovated an approach to word shapes, i.e., ‘morphology’, which was based on slicing words into typically non-overlapping items, MORPHS, and grouping these morphs into classes, or MORPHEMES. Two versions of this approach are referred to as ‘item and arrangement’ or ‘IA’, and ‘item and process’ or ‘IP’.² The distinction is illustrated schematically in Figures 6.1 and 6.2. More recent Chomskyan treatments of word structure by Bralich (1991), Lieber (1981), Selkirk (1982), Williams (1981), etc., add the notions of hierarchy (analysis in terms of immediate constituents), which was present in some earlier structuralist approaches, and ‘headedness’, which allows a morpheme which is an immediate constituent of a word-internal configuration of morphemes to be chosen as the ‘head’ of that structure, the item serving as the repository of the properties of the configuration as a whole. This can be represented schematically as in Figure 6.3.
[Figure 6.1: Word and Paradigm. Schematic listing the whole words WordA1, WordA2, WordA3, WordA4.]
[Figure 6.2: Item and Arrangement. Schematic pairing stem A with affix 1, affix 2, affix 3 and affix 4.]
[Figure 6.3: Item and Arrangement. Schematic labelled ‘Hierarchical IA’, with affix 1 and stem A as constituents.]
These sausage-slicing approaches, which we may refer to collectively as BENIHANA MORPHOLOGY, work fairly nicely with languages such as Turkish, which are referred to as morphologically ‘agglutinating’. From the very beginning, however, this approach to word structure has been plagued with difficulties when it was applied to languages such as English which don’t always exhibit this kind of pattern. Over the years, a number of increasingly abstract and implausible ad hoc
‘patches’ have been added to the basic mechanism to accommodate such uncooperative languages, including discontinuous morphs and stems, zero morphs, replacive and process morphemes, etc. One of the more recent of these patches (which I will refer to as SUBWAY MORPHOLOGY) adds a horizontal dimension to the analysis, replacing the sausage by a submarine sandwich with two interlocking layers in order to account for e.g., Semitic ‘tri-consonantal stem’ word structure (see McCarthy 1981), e.g.,
(1) Stratified IA, Syrian Arabic
    consonants k-t-b interleaved with vowels a-a: katab ‘has written’
    consonants k-t-b interleaved with vowels u-i: kutib ‘has been written’
The increasing power and ad hocery of such analyses made some linguists start wondering whether the traditional WP analysis might not have been abandoned prematurely, and alternative approaches were proposed in which the Benihana and Subway versions of IA morphology were replaced by formalized versions of WP.³ One framework which explicitly rejects the idea that words have any internal structure at all is lexicase dependency grammar (cf. Starosta 1988). A framework that has also moved in this direction is Stephen Anderson’s ‘A-morphous morphology’ (cf. Anderson 1992 and references cited there). Anderson was not able to dispense with internal word structure entirely because, I think, he is too strongly anchored within the Chomskyan system of language analysis, which is in many ways still very structuralist in its approach to natural language. There are two areas where the structuralist legacy in Anderson’s approach is especially encumbering: (1) the treatment of ‘stems’, and (2) compound formation. In this paper, I will indicate problems with Anderson’s analysis in both of these areas, and propose strictly nonsegmental alternatives that do not have the same problems.
Stems versus Principal Parts
Stephen Anderson characterizes his approach to compounding, adapted mainly from Selkirk, as ‘word-based’, e.g., ‘... Selkirk’s notion of word-internal phrase-structure grammar ... does not invoke any categories (such as “root”, “stem”, “affix”, etc.) other than those at the “word” level’ (Anderson 1992: 298). In fact, however, his general analysis is not word-based at all. This is because the basic unit which is stored in memory and serves as the input to Word Formation Rules is, by his own admission, not a word but a STEM:
The correct move here would appear to be to say that it is not words but stems that function as the base of Word Formation Rules. An appropriately constrained notion of stem, in turn, seems to be ‘word minus (productive) inflectional affixation’. (Anderson 1992: 71)
A stem is not a word but an abstract underlying form, and a theory that takes stems as the basic unit is not word-based. A stem is a truncated sausage in the Benihana IA approach to word structure. A stem may sometimes be coextensive with a word, but sometimes it is a subcomponent of a word, that is, a morpheme or a sequence of morphemes in the classical structuralist sense. Anderson’s position can be contrasted with the lexicase approach, by which words in a paradigm are not related to each other through the incorporation of a common underlying ‘stem’, but rather by form-meaning analogies obtaining between pairs or among n-ads of words, e.g.,
    aAb [aFi] : cAd [bFj] ::
    aBb [aFi] : cBd [bFj] ::
    aCb [aFi] : cCd [bFj] ::
Figure 6.4: WP Analogical Patterns
In this schematic diagram, a and b represent whatever elements of word shape (not necessarily segments or a single contiguous element) are common to all the words on the left side of the analogy, and c and d represent whatever elements of word shape are common to all the words on the right side of the analogy. Similarly, [αFi] and [βFj] represent whatever elements of word meaning or distribution are common to all the words on the respective left and right sides of the analogy. A, B and C represent whatever elements of word shape are common to the pairs of words in the analogy. As a concrete example, we might take regular present and past tense English verbs, e.g.,
slɪpt ‘slipped’ [+past]  :  slɪp ‘slip’ [–past]  ::
mɪst ‘missed’ [+past]  :  mɪs ‘miss’ [–past]  ::
wɔkt ‘walked’ [+past]  :  wɔk ‘walk’ [–past]  ::
...
Figure 6.5: WP Analogical Patterns: t Past and Zero Non-Past
Here, a and b correspond to the word-final segment t and c and d to the lack of such a segment. Similarly, [αFi] and [βFj] correspond to [+past] and [–past], and A, B and C to slɪp, mɪs, and wɔk respectively. The pattern exhibited by these related sets of words can be summarized by an analogical WFS. A first approximation for this WFS would be:
WFS (1) t past : ∅ non-past⁴
[+past]  :  [–past]
t]       :  ]
that is, corresponding to a word with the feature [+past] and ending in t, there is another word with the feature [–past] which has the same shape except that it lacks the t. A WFS summarizes the shared properties of a set of analogically related word pairs (or n-ads), and accounts for a speaker’s ability to recognize new members and add new items to the list on either side of the colon.
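The non-directional reading of WFS (1) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and not part of the lexicase formalism: the class name `EndingPattern`, its method names, the representation of a word as a (shape, feature) pair, and the IPA spellings of the example words. Like WFS (1) itself, it is only a first approximation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Word = Tuple[str, str]  # (shape, tense feature), e.g. ("slɪpt", "+past")


@dataclass(frozen=True)
class EndingPattern:
    """Hypothetical stand-in for a WFS that relates two word-final shapes."""
    left_feature: str
    left_ending: str
    right_feature: str
    right_ending: str

    def _swap(self, word, f_from, e_from, f_to, e_to) -> Optional[Word]:
        shape, feature = word
        if feature != f_from or not shape.endswith(e_from):
            return None  # the word does not fit this side of the pattern
        kept = shape[: len(shape) - len(e_from)]
        return kept + e_to, f_to

    def left_to_right(self, word: Word) -> Optional[Word]:
        return self._swap(word, self.left_feature, self.left_ending,
                          self.right_feature, self.right_ending)

    def right_to_left(self, word: Word) -> Optional[Word]:
        return self._swap(word, self.right_feature, self.right_ending,
                          self.left_feature, self.left_ending)


# WFS (1): [+past] t]  :  [-past] ]
T_PAST = EndingPattern("+past", "t", "-past", "")

print(T_PAST.left_to_right(("slɪpt", "+past")))
print(T_PAST.right_to_left(("wɔk", "-past")))
```

Neither side of the pattern is privileged: the same object maps a whole word on either side of the colon to a whole word on the other side, and no unit smaller than a word is stored anywhere.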
Similar analogies would apply for sets of words that differ in other ways, e.g.,
ræŋ ‘rang’ [+past]  :  rɪŋ ‘ring’ [–past]  ::
sæŋ ‘sang’ [+past]  :  sɪŋ ‘sing’ [–past]  ::
sæt ‘sat’ [+past]  :  sɪt ‘sit’ [–past]  ::
...
Figure 6.6: WP Analogical Patterns: Ablaut Past and Zero Non-Past
The pattern exhibited by these related sets of words can be summarized by a WFS (first approximation):
WFS (2) Ablaut æC] past : ɪC] non-past
[+past]  :  [–past]
æC]      :  ɪC]
that is, corresponding to a word with the feature [+past] and containing the vowel æ, there is another word with the feature [–past] which has the same shape except that it contains the vowel ɪ in the same position as the æ in the related form. Such analogical formulae are not biased toward agglutinating languages, and work without modification for English affixation and ablaut and Semitic ‘tri-consonantal stem’ morphology, for example (cf. Zvaig 1994). At no point is there any reference to a ‘stem’ which is smaller than and more abstract than a full word.
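The shape half of WFS (2) can be sketched as a correspondence between whole word shapes. The Python below is an illustration under stated assumptions, not an implementation of lexicase: the function names are hypothetical, the vowel inventory in `VOWELS` is an ad hoc list, C is taken to be exactly one non-vowel symbol, and the IPA spellings are assumed.

```python
import re

# Assumption: any symbol not listed here counts as a consonant (C).
VOWELS = "aeiouæɪʊɔʌə"
C = f"[^{VOWELS}]"


def swap_vowel_before_final_c(shape, old, new):
    """Rewrite `old` as `new` when it directly precedes a single final C."""
    match = re.fullmatch(f"(.*)({old})({C})", shape)
    if match is None:
        return None  # the shape does not end in old + C
    return match.group(1) + new + match.group(3)


def past_to_nonpast(shape):
    """æC] : ɪC], read from the [+past] side."""
    return swap_vowel_before_final_c(shape, "æ", "ɪ")


def nonpast_to_past(shape):
    """æC] : ɪC], read from the [-past] side."""
    return swap_vowel_before_final_c(shape, "ɪ", "æ")


print(past_to_nonpast("ræŋ"))
print(nonpast_to_past("sɪt"))
```

The function states only the shape correspondence. It would equally pair slɪp with an unattested slæp, so it says nothing about which words a speaker actually lists under the pattern.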
Compounds
Compounds as Exceptions to Pure A-morphous Morphology
The second area in which Anderson’s framework differs significantly from a pure WP analysis is in the area of compounds and noncompound ‘composites’ (Anderson 1992: 29). As in his approach to affixation, he accepts the widespread orthodoxy regarding such
word-formation. This dogma is stated within the Minimalist Programme by Randall Hendrick:
Everyone agrees that at least some words have internal structure. Compounds are good candidates for words with internal structure since a word like lighthouse is composed of two forms that are ‘free’ in the sense that they appear elsewhere as independent words. (Hendrick 1995: 301)
For Anderson, compounds are composed of free forms, words with their own syntactic category, and he regards it as necessary to preserve the lexical integrity and syntactic category information of these free forms even when a word is incorporated into a larger word:
... some words must be assigned internal structure. The canonical case of this sort is that of compounds, as traditionally construed: words made up of other words. We proposed that these should be assigned a structure by a set of Word Structure Rules, quite distinct from Word Formation Rules. Such rules have much the same form as phrase-structure rules, but apply only to lexical categories. Word Structure Rules operate to specify the internal constituency of lexical categories and (optionally) their heads. (Anderson 1992: 299)
The present chapter deals with the fact that unlike affixed words, compounds do provide motivation for assigning internal structure to words. (Anderson 1992: 293)
Compounding ... involves the combining of stems from the lexicon into a quasi-syntactic structure. This word-internal structure seems to be unique to compounds, in fact.... The formation of compounds seems to involve a genuinely syntactic combination of lexical items below the level of the word (perhaps along lines like those explored in Selkirk 1982), while non-compounds have only a phonological structure. (Anderson 1992: 292)
Anderson himself recognizes that this is a significant deviation from the ‘a-morphous morphology’ direction, but considers it to be an unavoidable one.
In the remainder of this paper, I will try to show that in fact it is not: compounds, like other words, can and should be analyzed as having no internal structure. In order to demonstrate
this, I will list and illustrate the main kinds of evidence Anderson presents for the necessity of requiring compounds to have internal structure, and then present alternative lexicase WP analyses which capture the same or better generalizations without requiring an internal structure analysis.
Headedness
One point that Anderson presents as clear evidence for an internal structure analysis for compounds is ‘headedness’. As has often been noted in the morphological literature going back to Pāṇini, some but not all compounds are ‘headed’. What this means is that one of the two or more free forms entering into the compound contributes its part of speech and significant parts of its meaning to the compound as a whole, while the other components typically only narrow the meaning of the whole word and may belong to different word classes. An example of this would be blackberry, a compound related to the words black [Adj] and berry [N]. This is a headed compound because blackberry, like berry, is a noun and a kind of berry. Anderson concludes, in the Chomskyan morphological tradition, that words such as blackberry must be analyzed as having internal structure, e.g.,
(2) [N [Adj black] [N berry]]
Word Structure Rules: In the Chomskyan morphological tradition, as exemplified in the work of Selkirk (1982) and Williams (1981) for example, such compounds are formed by a separate set of rules, referred to by Anderson as ‘Word Structure Rules’, and the headedness properties are accounted for by somehow getting features of one of the components to ‘percolate’ up to the level of the compound as a whole. Unfortunately it is hard to compare Anderson’s version of this analysis with a lexicase analysis to be presented below because the former exists so far only in programmatic form:
The point of these remarks is not at all to present a complete theory of Word Formation Rules applying to compounds, or of analogical rules of compound formation, since no such theory has been developed in any comprehensive form at present. (Anderson 1992: 297–98)
Anderson does give us several hints of the kind of formal analysis he has in mind. In fact, he outlines two apparently distinct approaches, without indicating how they can be reconciled. The first is a conventional Chomskyan approach in terms of constituent structure rules:
... we propose to let compounding be accomplished not by the operation of Word Formation Rules, but rather as syntactically structured objects described by phrase-structure rules such as (1) which describes the formation of Noun compounds both of whose constituent parts are Nouns. (Anderson 1992: 296)
(3) (1) N → N N
Problems: This formulation seems intended to cover both syntactic and lexical compounding processes, ‘... the compounds ... are structures whose formation takes place ... either in the lexicon or in the syntax’ (Anderson 1992: 297). This implies a reversion to the powerful syntactic models of the late 1960s, in which transformations could create lexical items. Rule (3) is presumably only intended as a schematic example, since for instance it says nothing at all about the crucial question of feature percolation. However, even this sketchy formulation is explicit enough to demonstrate the descriptive inadequacy of such an analysis, since it does not account (a) for headless ‘bahuvrihi’ compounds (Anderson 1992: 317), (b) for the fact that inflection typically does not carry over in compounding, or (c) for phonological adjustments that may occur in compounding (Schwanengesang, Greco-Roman, hippopotamus). Such phenomena require the creation of a separate ad hoc rule component whose formal properties are even vaguer than those of ‘Word Structure Rules’:
Compounding processes ... may be characterized by additional elements, not directly associated with any of the compound items but rather with the compound structure itself. These elements are motivated only by the word-formation pattern itself, in which they serve as a sort of morphological ‘glue’ rather than as independently meaningful elements. (Anderson 1992: 41)
The answer to this problem does not require us to recognize Word-Formation Rules that create compounds, but only that we recognize such rules that apply to them. Thus German and many
other languages have Word Formation Rules that perform changes like that in (2) [my (4) below]. (Anderson 1992: 296–97)
(4) [N [N X] [N Y]] → [N [N X {-en-, -s-}] [N Y]]
Such rules are different in kind from normal Word Formation Rules, and their status in the grammar is unclear. Anderson proposes additional evidence from Chinese to support the need for such rules:
In any case, Word Formation Rules that specifically apply to compounds are not always semantically empty. Again from Mandarin, there is a rule that takes compounds like kàn jiàn ‘look at-perceive: see’ and infixes de or bu to yield a modal sense. We thus have kan de jian ‘can see’ and kan bu jian ‘cannot see’. This is a rule that applies exclusively to compounds of the form [V [V X] [V Y]], again providing evidence for analyzable word-internal structure. (Anderson 1992: 297)
Unfortunately for Anderson’s claim, though, Chinese forms like kànjiàn are in fact not compounds at all but rather affixed forms, despite the orthographically motivated label commonly applied to them in the Chinese linguistic literature,⁵ and the analogical derivation rule that infixes dé or bù/bú into derived resultative verbs can and must be stated according to word shape (insert dé or bù/bú before the final syllable), not by internal structure (Starosta, Kuiper, Wu and Ng 1997). Probably the most important observation to be made about the rule formulated in (3) is that it shows that any resemblance between such ‘Word Structure Rules’ and real syntactic phrase structure rules is purely coincidental, as Anderson himself clearly establishes (Anderson 1992: 26ff) in connection with his discussion of Baker’s ‘noun incorporation’ analysis (Baker 1987). That is, Anderson’s rule (1) does not satisfy even the vague and weakened version of the X-bar conventions assumed in Chomskyan constituent structure analyses (Kornai and Pullum 1990, cf. Pullum 1985). This means that any claim that this approach to morphology in any way captures cross-component generalizations by using the same kinds of rules in syntax and morphology (cf. e.g., Bralich 1991) is vacuous.
An Amorphous Analogy Alternative: Anderson maintains that the fact that some compounds are headed requires us to provide them with internal structure:
... it appears there is an argument for assigning internal structure to complex words. Specifically, when we create a compound, the result belongs to some lexical category (not necessarily that of any of its parts), and it fills the same slots as non-compound members of the same category. (Anderson 1992: 294)
However, his claim goes through only if there is no alternative way to account for such formations without referring to internal structure. In fact, there is such a way. The alternative was outlined by Starosta in 1988 (Starosta 1988: 60–64) and developed independently and in much more detail by Ford, Singh and Martohardjono:
The problem lies in an analysis of these ‘compound’ words that identifies among their constituents two bone [sic] fide words. Under our analysis only one of the two constituents so identified is, in fact, a bona fide word; the other inevitably turns out to be a morphological constant, not a variable, that only looks like a word: a suffix, affix or infix that, despite its phonological structure, which is very often identical to that of a bona fide word, has, associated with it, neither the syntactic category nor the plena of semantic uses that characterize the true word. (Ford, Singh and Martohardjono 1997: 57–58)
In both approaches, the relation between compounds and one of their ‘constituents’ can be stated as a WFS in which the constant term in a set of compounds is treated as an affix:
dɔg ‘dog’ [N, αFi]  :  dɔghaws ‘doghouse’ [N, +dmcl, αFi]  ::
bɝd ‘bird’ [N, βFj]  :  bɝdhaws ‘birdhouse’ [N, +dmcl, βFj]  ::
kæt ‘cat’ [N, γFk]  :  kæthaws ‘cathouse’ [N, +dmcl, γFk]  ::
nʌt ‘nut’ [N, δFi]  :  nʌthaws ‘nuthouse’ [N, +dmcl, δFi]  ::
mʌŋki ‘monkey’ [N, εFm]  :  mʌŋkihaws ‘monkeyhouse’ (at a zoo) [N, +dmcl, εFm]  ::
;dmcl = ‘domicile’
Figure 6.7: Compound Formation Analogy: X house
From these examples we can extract a pattern, a ‘compound derivation rule’ for right-headed ‘N-N compounds’ ending in -haws:
WFS (3) N-house ‘compounding’⁶
N        N
αFi   :  +dmcl
         αFi
]     :  haws]
that is, given any noun with the semantic representation [αFi], there may be a noun ending in the sequence haws and bearing the semantic features [+dmcl, αFi]. This WFS is of the same type as WFSs that describe conventional lexical derivation processes such as the derivation of adverbs ending in ly] from adjectives. Unlike Chomskyan analyses (including Anderson’s), there is one ‘output’ noun and only one ‘source’ noun, not two.⁷ The constant features of word class [N] and meaning [+dmcl] and the constant sequence of phonemes haws that characterize this set of nouns capture the intuition of ‘head of a compound’ that is shown in Chomskyan frameworks by assigning an internal hierarchical constituent structure to the word. The lack of primary stress on the last syllable of such words is just an ordinary property of derivational affixes.
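The claim that WFS (3) relates one noun to one other noun, with no bracketing inside either, can be sketched as follows. This is a minimal illustration under assumptions, not the lexicase formalism itself: the function names are hypothetical, a word is modelled as a shape string plus an unordered feature set, and `+canine` is a made-up placeholder for the variable features written αFi.

```python
def add_house(shape, features):
    """WFS (3), left to right: from a noun to the related noun ending in haws."""
    if "N" not in features:
        return None  # the pattern only relates nouns
    return shape + "haws", frozenset(features) | {"+dmcl"}


def remove_house(shape, features):
    """WFS (3), right to left: from a +dmcl noun in haws to the related noun."""
    if not ({"N", "+dmcl"} <= set(features)) or not shape.endswith("haws"):
        return None
    return shape[: -len("haws")], frozenset(features) - {"+dmcl"}


# "+canine" is an invented stand-in for the variable features αFi.
dog = ("dɔg", frozenset({"N", "+canine"}))
doghouse = add_house(*dog)

print(doghouse[0], sorted(doghouse[1]))
print(remove_house(*doghouse) == dog)
```

The result is again a flat pair of a shape and an unordered feature set. Nothing in it records that haws resembles a free word, which is the sense in which the constant term is treated as an affix.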
A similar WFS can be constructed for words ending in beri] such as blackberry and blueberry:
WFS (4) Adj-berry noun
Adj      N
αFi   :  +brry
         αFi
]     :  bèri]
;brry = ‘berry’
In treating this phenomenon as lexical derivation, the analysis allows for and expects the meanings of some of the derived forms to be unpredictable given the ‘source’ form and the WFS. Not all adjectives have corresponding berry] nouns, and not all berry] nouns have corresponding adjectives; thus there is no *grayberry [N] or *puceberry [N], and no *rasp [Adj] or *gooze [Adj] or *poke [Adj] or *dingle [Adj]. The pattern does account for speakers’ ability to extend the analogy and recognize and invent new words, as the founders of the Brownberry Ovens in Oconomowoc, Wisconsin, have done. All the words described by this WFS share the class feature N and the semantic feature [+brry], and end in the phoneme sequence bèri]. These are the properties that hierarchical-morpheme-structure analyses try to account for by making one of the component morphemes of the word the ‘head’. Unlike Anderson’s IA analysis (Anderson 1992: 93ff), this approach does not posit feature structures arranged hierarchically inside a matrix. The features of a word are an unordered list, and the features of ‘compounds’ are no exception. The WFS does not assign or refer to any internal bracketing or constituent labeling whatsoever. The new word occupies a normal position in the system of binary lexical contrasts, something that would be awkward or impossible to state in a hierarchically structured feature representation.
Irregular Plurals: One argument Anderson advances for the necessity of internal structure for compounds is the fact that a compound containing a noun with an irregular plural will also have an irregular plural, e.g., the plural of scrubwoman is scrubwomen, not *scrubwomans as we might expect if there were no explicit connection between woman and scrubwoman (Anderson 1992: 95). This is of course an argument only if there is no alternative way to account for this morphological parallel without assuming internal structure for compounds. However, there is. The lexicase account of irregular
plurals is in terms of analogical patterns that refer to shape, meaning, and/or syntactic class. The one for woman : women for example would be:
(5)
N        N
-plrl :  +plrl
ʊmṇ]  :  ɪmṇ]
This pattern states that corresponding to a singular noun ending in ʊmṇ, there may be a plural noun ending in ɪmṇ. Other aspects of the shape are unchanged, including the position of the stress. The rule only applies if the conditions on shape and meaning are met. It refers to the shape of the end of the word, not internal constituent structure, and works just as well for compounds as for non-compounds, i.e., it applies without modification to:
(6)
woman               :  women
scrubwoman          :  scrubwomen
charwoman           :  charwomen
congresswoman       :  congresswomen
superwoman          :  superwomen
clubwoman           :  clubwomen
professional woman  :  professional women
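That pattern (5) inspects only the end of the word and its features can be sketched in a few lines of Python. The names `SG_END`, `PL_END`, `woman_plural` and `SINGULAR` are hypothetical, the endings ʊmṇ and ɪmṇ are assumed IPA renderings of the two shapes in (5), and the transcriptions of the longer words are invented for illustration.

```python
SG_END = "ʊmṇ"  # assumed rendering of the singular ending in pattern (5)
PL_END = "ɪmṇ"  # assumed rendering of the plural ending in pattern (5)


def woman_plural(shape, features):
    """Apply pattern (5): look only at the end of the word and its features."""
    if not ({"N", "-plrl"} <= features and shape.endswith(SG_END)):
        return None  # conditions on shape or features are not met
    new_shape = shape[: -len(SG_END)] + PL_END
    return new_shape, (features - {"-plrl"}) | {"+plrl"}


SINGULAR = frozenset({"N", "-plrl"})

# The same function handles the simple word and the longer ones alike,
# because it never asks what precedes the final sequence.
for shape in ("w" + SG_END, "skrʌbw" + SG_END, "klʌbw" + SG_END):
    print(shape, "->", woman_plural(shape, SINGULAR)[0])
```

No argument of the function carries bracketing or a constituent label, so the parallel between woman and scrubwoman follows from shared word-final shape alone.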
Again, the wʊmṇ] and wɪmṇ] suffixes are unstressed, in contrast to Adj N sequences in e.g., a super wóman and a professional wóman. The phrases *a congress wóman and *a club wóman don’t exist because nouns can’t modify nouns in English. The WFS approach to irregular plurals also makes possible a plausible formal analysis of compounds with ‘variable plurals’, such as those cited in Anderson (1992: 295):
(7)
A           :  B           :  C
foot        :  feet        :  (foots)
flatfoot    :  flatfeet    :  flatfoots
tenderfoot  :  tenderfeet  :  tenderfoots
The plurals in column B are formed by an analogical rule that refers to semantic features as well as shape:
WFS (5) ʊ : i ablaut plural
N        N
-plrl :  +plrl
+podl    +podl
ʊt]   :  it]
;podl = podal ‘foot-related’
This rule applies to foot because it ends in ʊt and has a semantic representation that refers to feet (represented here by an arbitrary feature +podl). It also applies to tenderfoot and flatfoot if the speaker’s lexical representation for these words incorporates a reference to feet. However, if the words undergo a semantic shift in the speaker’s lexicon such that the connection with feet (symbolized by the feature [+podl]) is lost, the default plural rule applies:
WFS (6) s] plural
N        N
-plrl :  +plrl
]     :  s]
This same rule would apply in child language acquisition at the point at which the child has constructed WFS (6) but has not yet figured out WFS (5). It is also possible that a given speaker may never create WFS (5), in which case feet will be stored as an unpaired item and only flatfoots and tenderfoots will be created as plurals of flatfoot and tenderfoot. The identical analysis can be applied to examples from another end of the body: saber tooth : saber tooths, milk tooth : milk teeth (cf. Anderson 1992: 295). The rule relating tooth and teeth (if there is one for a given speaker) applies to words ending in uθ] and referring to dentition, and so will apply to the compound milk tooth without any reference to internal structure. Saber tooth on the other hand may not meet the structural description for this rule if the dentition feature is not retained in the derivation of the compound saber tooth tiger and/or the subsequent truncation to saber tooth.
WFS (7) uθ] : iθ] dentition plural
N        N
+dntl :  +dntl
-plrl    +plrl
uθ]   :  iθ]
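The interaction between the semantically conditioned ablaut plural (WFS (5)) and the default plural (WFS (6)) can be sketched as a single choice made on a flat feature set. This is a hedged illustration, not the lexicase mechanism: the function name `plural` is hypothetical, trying the more specific pattern before the default is my assumption about how the two patterns are consulted, and the vowel symbol ʊ and the transcription flætfʊt are assumed spellings.

```python
FOOT_END = "ʊt"  # assumed rendering of the singular shape in WFS (5)


def plural(shape, features):
    """Use the ablaut pattern when its conditions hold, else the default."""
    if not {"N", "-plrl"} <= features:
        return None  # only singular nouns have a plural counterpart here
    if "+podl" in features and shape.endswith(FOOT_END):
        new_shape = shape[: -len(FOOT_END)] + "it"  # WFS (5): ʊt] : it]
    else:
        new_shape = shape + "s"                     # WFS (6): ] : s]
    return new_shape, (features - {"-plrl"}) | {"+plrl"}


foot = frozenset({"N", "-plrl", "+podl"})
shifted = frozenset({"N", "-plrl"})  # link to feet lost after semantic shift

print(plural("fʊt", foot)[0])
print(plural("flætfʊt", foot)[0])
print(plural("flætfʊt", shifted)[0])
```

The two plurals of flatfoot differ only in whether the feature +podl is present in the word's feature set. At no point does the function ask whether the word contains foot as a constituent.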
Composites: A second kind of construction Anderson cites in support of his claim for the necessity of word-internal structure is complex forms with internal inflection, such as English men of war, brothers-in-law, passers by and Icelandic place names (Anderson 1992: 295). In other languages with more overt inflection, the inflectional features may be passed down to non-heads as well as heads. Many Icelandic place names are Adjective-Noun compounds for instance, and both parts get inflected for case: cf. Breiði-fjörður ‘Broadfirth’, genitive Breiða-fjarðar. Note that this extension of inflection to both parts shows that the first (non-head) part of the compound, as well as the second part is an identifiable structural unit. Anderson considers this form to be a single proper noun, but its two components inflect independently and in concert, exactly like the head noun and modifying adjective in a noun phrase. While such examples certainly do show convincingly that the two components must maintain their independent status, they raise a more basic question: why does Anderson consider this form to be a single word in the first place, when grammatical facts indicate that it isn’t? The unstated assumption here is that this must be a single word because of its semantic non-compositionality; that is, Breiði-fjörður is not just any broad firth, as we would expect from the meanings of its apparent components, but rather a particular one at a particular place, known to the participants in the communicative act. This special status is reflected in the capitalization in Icelandic orthography, paralleling English forms such as ‘White House’.
(a) Compounds vs Phrases: The logical problem with this conclusion is that while composite word forms are normally non-compositional, non-compositionality is not a sufficient criterion for regarding a sequence of otherwise independent word forms as a single word. This is shown by the existence of English phrasal ‘idioms’ like kick the bucket, keep tabs on, etc. They are regarded as non-compositional but still phrasal, and I would analyze Anderson’s English and Icelandic examples in the same way. The internal inflection is then not odd behaviour for compounds at all, but on the contrary standard evidence against proposing a compounding analysis for these forms.
(b) Idioms in Lexicase: The lexicase analysis of idioms regards them in normal dependency grammar fashion as sequences of words related by pairwise unidirectional dependency relations (Starosta 1994). They differ from ordinary phrases in that the components are semantically and syntactically distinct from their non-idiomatic homophonous counterparts. Thus, kick₂ and bucket₂ in kick₂ the bucket₂ are not the same kick and bucket as the homophonous words in kick₁ a football and empty the bucket₁. Instead, kick₂ is a word which means ‘die’, and which requires a particular lexical item, bucket₂, as its accusative Patient.⁸ It does not allow any other dependents. bucket₂ is a particular lexical item which means ‘death’, and which requires the article the as its dependent. It also allows for no other dependents. kick₂ the bucket₂ is thus a phrase differing from kick₁ a football not in compositionality, since the meaning ‘die’ is extractable from the components kick₂ ‘die’ and bucket₂ ‘death’, but in the very specific selectional requirements imposed by kick₂ and bucket₂ on their dependents.
(c) Icelandic: The point of this discursus on idioms is that the same analysis used in the lexicase analysis of idioms can also be applied to Anderson’s Icelandic place name examples. In (8) and (9), ‘Broad Firth’ has the semantic and distributional properties of a proper noun because the head noun, fjörður ‘Firth’, is a proper noun [N, +prpr]. In this formulation, breiði and breiða are marked with the idiomatic feature [BROD], while fjörður and fjarðar are marked lexically with [?[BROD]] to show that they require a dependent bearing this feature. breiði and breiða lack the usual contextual feature [?([Adv])], and so do not allow dependents like ‘very’, while fjörður and fjarðar lack the usual nominal feature [?([Det])] and so allow no determiner dependents such as ‘this’.
breiði fjörður and breiða fjarðar are thus grammatically both phrases rather than words, and thus do not constitute counterexamples to the claim that words do not have internal structure. Any given sequence of recognizable forms that constitutes a grammatical unit is either a phrase, composed of words with their own internal syntactic word class, related by pairwise dependency relationships, and having a compositional meaning, or is a single word with no internal structure and no compositional meaning. There is nothing in between, such as the ‘Zwischending’ that Anderson proposes, and no ad hoc stipulation is required:
While Word Formation Rules have been argued (here and in Chapter 10) not to create structure in general, they can in particular cases be stipulated to create internal structure. Known cases seem to involve the creation of headed structures in which (internal to the resulting word) only the base form bears a category label. In the Georgian, Icelandic and Russian cases of noncompound composites considered above, it is not clear that there is any motivated syntactic category to which to assign the peripheral element. (Anderson 1992: 303)
(8)
breiði     fjörður
1ndex      2ndex
Adj        N
BROD       +prpr
Nom        1 ([Adj])
           1 [BROD]
           Nom
           1 (Adj) Nom
‘Broad Firth’
(9)
breiða     fjarðar
1ndex      2ndex
Adj        N
BROD       +prpr
Gen        1 ([Adj])
           1 [BROD]
           Gen
           1 (Adj) Gen
‘of Broad Firth’
(d) In-laws: The phrasal idiom approach could also be the right way to account for apparent cases of internal inflection in English such as sons in law. In such an approach, sons, in, and law would be idiomatic words sons₂, in₂, and law₂ with word-specific selectional restrictions such that son₂ and sons₂ (and other kinship nouns derived into the same class) would require a single right dependent in₂ [P] ‘by’, which in turn would require a single right dependent law₂ [N] ‘marriage’, which would allow no dependents of its own. One problem with this analysis, however, is that the stress pattern of these sequences is that of a word rather than of a phrase. An alternative would be to write two compounding rules to produce the singular and plural forms directly from the inflected blood relative nouns:
WFS (8) In-law nouns
N        N
+knsp    +knsp
-mtrm :  +mtrm
αplrl    αplrl
βFi      βFi
]     :  ɪnlɔ]
;knsp = kinship
;mtrm = matrimony
This rule says that any singular or plural kinship noun may have a singular or plural matrimonial kinship counterpart ending in ɪnlɔ. The consanguineous kinship term is the ‘head’, since it contributes its semantic features and stress pattern to the compound, while the ɪnlɔ] sequence is a desinence associated with the matrimonial kinship feature [+mtrm]. The plural shape carries over in derivation without additional specification. No internal structure is assumed or assigned to the compound. The worrisome part of this analysis is that if it is this easy, why aren’t there a lot more of such examples? That is, there is nothing in the WFS analysis so far that accounts for the well-known fact that inflectional word-formation seems to apply to the morphologically least marked form. In fact, forms such as son-in-laws do sometimes occur as the plural of son-in-law in informal speech. This would lend support to an analysis that cannot generate the inflection-internal alternatives, but the WFS strategy does not incorporate this beneficial bias.
We might compare this single-lexeme derivational analysis with the internal-structure feature-seeping ‘Word Structure Rules’ proposed for such examples by Anderson:
Furthermore, these inflectional feature(s) can also be passed down to non-initial heads, in the rare cases (in English) where such constituents seem to be motivated: thus sons in law, men of war, passers by. (Anderson 1992: 294–95)
Unfortunately Anderson gives us no explicit examples of the downward percolation mechanism that accomplishes this result. We thus can’t tell whether it actually works, how powerful it is, how it accounts for the marked nature of such patterns, or how it prevents massive over-generation. Pending an explication that is sufficiently generative to compare with the lexicase analyses, the latter, with their flaws, must be considered superior.
‘Compounds’ Containing Bound Morphemes:
(a) Pseudocompounds
Various languages have forms which look a lot like compounds intuitively, but for which one of the two components does not satisfy the definition of a free form. Examples include Munda ‘noun incorporation’, Chinese ‘resultative verb compounds’, and N-X-N ‘compounds’ in Greek, German, English, and other languages. Anderson suggests that such formations might be accounted for by lexical rules, but continues to maintain that they must have internal structures:
Since the lexicon describes some irreducibly phrasal structures (in the case of idioms), there is no reason to doubt that internally structured compounds can also be present there, even in cases where one (or more) of the elements contained within the structure are not independently available as lexical items.... (Anderson 1992: 318)
Note that the analogy with idioms here is misdrawn. Idioms are syntactic constructions while pseudocompounds are uncontroversially single words, so positing a phrasal analysis for the phrasal examples does not warrant a phrasal analysis for non-phrasal lexical entries.
This is the case for ‘pseudo-compounds’ like Italo-American or tetraethyl, and an additional class of rules of analogy over lexical structures were tentatively suggested to describe the extent to which this class can be semi-productively extended. The properties of such rules remain, at present, a project for future research, but the idea seems more promising than the assumption of independent lexical status for items like Italo-, tetra-, etc. (Anderson 1992: 318)
... a class of problems related to those above is posed by examples such as Sino-Japanese (friendship treaty). Here we certainly do not want to claim that Sino- is a word (or that the parts of productive technical compounds erythromycin, etc., are either) but these elements still seem to turn up in newly formed words. The alternative of saying that there is a Word Formation Rule of ‘Sino-prefixing’ or the like also seems thoroughly unpalatable. Neither of these moves is necessary, however, assuming that we have some compounds in our lexicon whose parts do not occur independently. These compounds can serve as the foundation from which others can be formed analogically. (Anderson 1992: 298)
If the alternative to a rule of ‘Sino-prefixing’ is increasing the power of the theory to accommodate these otherwise indigestible examples, the amorphous Sino-prefixing analysis, which is in fact the one made available by the constrained lexicase theory, does not seem to me all that ‘unpalatable’.⁹ Such rules of pseudo-compounding have been used in lexicase analyses of Sora ‘noun incorporation’ (Starosta 1989–90), Micronesian ‘noun incorporation’ (Starosta 1998) and Chinese ‘resultative compounds’ (Ng and Starosta 1996 and Starosta et al. 1997). The Sino-prefixation WFS will illustrate the mechanism.
WFS (9) Sino-prefixing
Adj      Adj
+ntnl    +ntnl
αFi   :  +chns
         +bltr
         αFi
[     :  [sayno
;ntnl = national
;chns = Chinese
;bltr = bilateral
that is, corresponding to an adjective referring to some national characteristic [+ntnl, αFi], there may be another adjective that has
the same shape except for an additional initial sequence [sayno, and that refers to the same national characteristic augmented by a bilateral Chinese component [+ntnl, αFi, +chns, +bltr]. This WFS is an analogical formula constructed along the lines Anderson suggests. It is just a summary of the pairwise relations we can find in a speaker’s internalized lexicon at any given time. The most important point from a theoretical perspective is that there is no separate lexical entry Sino- in this analysis, and no reference to internal structure is required. Similar rules can be constructed for Italo-, tetra-, etc. It might seem extravagant to have a separate WFS for each such formulation, and an equally explicit but more economical alternative would of course always be preferable. So far, I don’t know of any.
The proposal just made involves assigning the sort of internal structure that appears within compounds to forms where the constituents are not independently attested words. Let us then call the larger class of words with internal structure composites, with those cases formed directly by the Word Structure Rules from independent words constituting the special subclass of compounds as standardly conceived. (Anderson 1992: 299)
Given the lexicase amorphous analysis just illustrated, there is no need to extend the morphological theory to allow for this new enlarged class of ‘composites’ or for the class of ‘forms where the constituents are not independently attested words’.
(b) Cranberries
Because analogical rules of compounding are non-directional, they can be used in ‘back formation’, to extract ‘ions’ (Chao 1968: 159) which do not exist synchronically in the lexicon, and sometimes never did, e.g.,
WFS (4) Adj-berry noun
Adj          N
αFi      :   +brry
]            αFi
             béri]
This rule could be used to analogically extract a non-occurring free form *cran:
(10) analogical cran extraction
blúberi ‘blueberry’ : blú ‘blue’ ::
strɔ́beri ‘strawberry’ : strɔ́ ‘straw’ ::
kránberi ‘cranberry’ : ??
As far as I know, cran, huckle, dingle, etc., are not used as free words, though cran has been analogically extracted for use in the product name cranapple. However, it is in principle possible for such ‘radicals’ to become free molecules, as shown by the case of pineapple. In Hawai’i, pine is often used to refer to pineapple, by loose analogy with other compounds, e.g.,
(11) analogical pine extraction
crabapple : crab ::
thornapple : thorn ::
custard apple : custard ::
horse apple : horse ::
pineapple : ??
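The proportional extractions in (10) and (11) can be illustrated with a minimal sketch. It is not part of the lexicase formalism: it assumes ordinary spelling instead of the phonological representations used above, it ignores the syntactic and semantic features of the rule, and the name extract_by_analogy is hypothetical. The point it tries to show is only that the missing term of the proportion can be found by comparing whole words, without storing cran or pine as separate entries.

```python
def extract_by_analogy(known_pairs, target, ending):
    """Solve 'compound : base :: target : ??' by whole-word comparison.

    Returns the candidate base for `target`, or None when the known
    pairs do not support the proportion.
    """
    # The proportion is licensed only if every known pair shows it:
    # each compound must equal its base followed by the shared ending.
    for compound, base in known_pairs:
        if compound != base + ending:
            return None
    # The target must show the same word-final sequence and must be
    # longer than the ending itself.
    if not target.endswith(ending) or target == ending:
        return None
    return target[: -len(ending)]


# Illustrative pairs taken from (10) and (11), in ordinary spelling.
berry_pairs = [("blueberry", "blue"), ("strawberry", "straw")]
apple_pairs = [("crabapple", "crab"), ("thornapple", "thorn")]

print(extract_by_analogy(berry_pairs, "cranberry", "berry"))  # cran
print(extract_by_analogy(apple_pairs, "pineapple", "apple"))  # pine
```

The function never consults an entry for the extracted string; whether the result then becomes a free word, as pine has in Hawai’i, is a separate fact about the lexicon.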
This analogical analysis may be contrasted with Anderson’s more conventional analysis, in which sequences such as cran- are identified as ‘bound morphemes’ with their own syntactic class designation, even though they cannot really have a syntactic class since they have no syntactic distribution:
Naturally, many compounds are not created syntactically but rather reside in the lexicon (where their idiosyncrasies of sense, etc., are stored). Some of these lexical compounds will contain bound forms of the familiar cranberry morph type. Note that we do not have to say that the lexicon contains a distinct item cran- so long as it contains [N [N cran] [N berry]] with the appropriate sense of the whole. (Anderson 1992: 297)
Often the two items that are brought together in a compound also appear as independent words in the language, as in English truckdriver, skeet-shoot (clay pigeon shoot), open-door (policy). In some cases, only one of the two appears as an independent word: the classic example of this situation is the name of berries in English, such as raspberry, huckleberry, loganberry, cranberry. In some cases, both
members of a compound are elements that only occur combined with others: e.g., chipmunk, mushroom, somersault.10 Frequently the sense of an element in a compound seems to have nothing to do with the sense of a (formally similar) element that appears elsewhere: broadcast, blacksmith, hotdog, strawberry. We will consider such formations to be compounds (as opposed to stem modifications) even when one (or both) of the elements involved does not occur freely, so long as they form a part of a structural pattern based on open ended word classes. That is, where a word formation pattern involves two or more such members of open classes, it will be treated as compounding;.... (Anderson 1985: 40)
Anderson’s reasoning here, especially in the second case, seems quite incomprehensible. If a compound is defined as a word composed of two free forms, how can he apply the term to situations containing only one free form, or no free forms at all? I am reminded of James McCawley’s tongue-in-cheek proposal that the word mother be analyzed as containing the morphemes moth and -er. The lexicase analysis described above of course contains neither a separate cran- entry nor words containing cran- as a labeled bound morpheme, yet still allows cran to be identified and extracted by a simple analogical process.
(c) -ceive : -ception
Another classic problem for IA morphology, and argument for the necessity of internal structure in compounds, involves pairs of verbs and related nouns ending in siv and sepšən, e.g.,
(12)
receive : reception
conceive : conception
perceive : perception
deceive : deception
On the same basis, we can say that English prefix-stem combinations are present in the lexicon in structurally analyzed form, but the individual prefixes and stems are not. This is another case in which structure has to be accessible somehow, so that we can identify the -ceive of receive, deceive, etc., as the same in each instance (in order for Word Formation Rules that apply to these verbs to get the special allomorphy right in reception, deception, etc.). (Anderson 1992: 297)
However, it is easy to write an analogical rule which accounts for this relation without referring to internal structure or requiring the counterintuitive recognition of a bound morpheme CEIVE with allomorphs /siv/ and /sep/:
WFS (10) ceive] and ception]
V            N
αFi      :   +bstr
sív]         αFi
             sépšən]
that is, verbs ending in the sequence sív may have a corresponding abstract noun ending in sépšən. Because this is a derivation rule, and because derivation is typically non-productive, we may expect to find exceptions, e.g.,
(13)
*inceive : inception : *inceiver
*exceive : exception : *exceiver
*interceive : interception : *interceiver
*contraceive : contraception : *contraceiver
*transceive : *transception : transceiver
undeceive : *undeception : *undeceiver
Because WFS (10) refers only to the syntactic class, semantic features, and phonological shape of the end of a word, Anderson’s claim that ‘structure has to be accessible’ to account for such examples is shown to be invalid.
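A rough computational reading of WFS (10) may make the claim concrete. The sketch below is an illustration under stated assumptions, not the lexicase rule itself: it uses spelling (ceive, ception) in place of the phonological shapes, it lets the class label N stand in for an abstract noun, and the names related_form, LEXICON and is_attested_pair, like the toy lexicon, are hypothetical. The rule inspects only the word class and the end of the word, and because it is non-directional it can be read from verb to noun or from noun to verb.

```python
VERB_END, NOUN_END = "ceive", "ception"


def related_form(form, word_class):
    """Return the (form, class) that the rule relates to the input, or None."""
    if word_class == "V" and form.endswith(VERB_END):
        return form[: -len(VERB_END)] + NOUN_END, "N"
    if word_class == "N" and form.endswith(NOUN_END):
        return form[: -len(NOUN_END)] + VERB_END, "V"
    return None


# Toy lexicon of attested words (an assumption for illustration only).
LEXICON = {
    ("receive", "V"), ("reception", "N"),
    ("deceive", "V"), ("deception", "N"),
    ("inception", "N"), ("undeceive", "V"),
}


def is_attested_pair(form, word_class):
    """True only if both the word and its rule-related partner are listed."""
    partner = related_form(form, word_class)
    return (
        partner is not None
        and (form, word_class) in LEXICON
        and partner in LEXICON
    )


print(related_form("receive", "V"))         # ('reception', 'N')
print(related_form("inception", "N"))       # ('inceive', 'V')
print(is_attested_pair("receive", "V"))     # True
print(is_attested_pair("inception", "N"))   # False
print(is_attested_pair("undeceive", "V"))   # False
```

The gaps in (13) then come out as words whose rule-related partner is simply not listed, which is what one expects of a non-productive derivational pattern.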
Andersonian Analogy: The analogical approach to ‘pseudo-compounding’ I have just illustrated can be compared to the one Anderson mentions as an alternative to the Chomskyan constituent structure rule approach, and the one which he adopts elsewhere in his chapter:
The same kind of internal structure can be extended to items already in the lexicon like English pseudo-compounds based on bound combining forms (Sino-Japanese, erythromycin, etc.); and also for prefix-stem combinations (receive, perceive, etc.). This account also entails the existence of rules of analogy or lexical intersubstitution which yield new internally complex lexical items built on the same pattern when necessary. These mechanisms are
presumably lexical rather than syntactic, and there is no reason to believe the syntax cares one way or another about the differences in formation between these words and others. (Anderson 1992: 299)
On the basis of these lexicalized compound forms, we can suggest that languages have rules of analogical compound formation. That is, given the compounds [N [N X1][N Y]] and [N [N X2] [N Z]], such a rule provides the license for coining [N [N X2] [N Y]]. The semantics of such words is presumably arrived at by a sort of ‘triangulation’ from what we know about the meaning of the parts and the wholes that we already have. (Anderson 1992: 297)
... the theory admits a class of structured compounds, a class of Word Formation Rules that can refer to this structure, and a class of rules that form compounds analogically on the basis of other compounds. (Anderson 1992: 298)
This general IA approach is quite similar in strategy to the one I have just outlined and exemplified, but totally different in the assumption that internal hierarchical structure must be assumed for compound words. If it were developed, I think Anderson would find, as I have, that the ‘class of structured compounds’ and the ‘class of Word Formation Rules that can refer to this structure’ were unnecessary, and that a single ‘class of rules that form compounds analogically on the basis of other compounds’ suffices for all descriptive and explanatory purposes.
Phonological Structure: In a pure amorphous morphological analysis, the only word-internal structure that words have is phonological dependency structure. As in syntax, any general pattern of canonical syllable or word structure can be extracted from the aggregate of individual pairwise segment-to-segment dependencies. New words that are added to a language will normally adapt themselves to this pattern immediately or gradually, though if enough phonologically exceptional items are added to the lexicon, it may swing the statistical balance and make a formerly marked pattern an unmarked one. Lexicase analogical derivation rules that create compounds add words to the lexicon, and such words may not be in conformity with
statistically normal patterns of phonological dependency. However, it does not follow from this that compounds need to be assigned internal morphological structure to account for this exceptional patterning. Rather, we only need to say that some of the phonological dependency relationships resulting from the creation of these new words are statistically abnormal. Thus, in the classic examples of nitrate, night rate, and nye trait, nitrate fits the canonical English syllable and word patterns while night rate and nye trait deviate from them. This distinction can be accounted for in terms of which segments depend on which other preceding or following segments, and there is no necessity for referring to word-internal morphological boundaries to do this. Over the course of time, such marked forms may gradually adapt to the statistical norm, sometimes to the point where their etymological compound source is no longer synchronically identifiable, as in the case of English *housewife > hussy, *boat swain > bosun, and *wind eye > window.
Conclusion
Headedness
Anderson’s final conclusion regarding the internal headedness of compounds is appropriately weak and circumscribed:
From these considerations, we conclude that at least some words apparently have a structure which is not purely phonological in character; and furthermore, that a notion of head is applicable to some (perhaps most plausibly, to at most one) of the overt constituents of a word. This structure is relevant to the morphology, in that it can serve as the basis for ‘transmitting’ the marking of a morphosyntactic property from an entire element to one of its subconstituents. (Anderson 1992: 295–96, emphasis added)
I have attempted to show that this conclusion is not warranted, even in such a wishy-washy form, because in each case cited by Anderson, an alternative analysis is available which accounts for the same data more explicitly and without any reference to internal non-phonological structure or boundaries. Finally, it is also worth noting that Anderson does not tell us very specifically how he thinks that morphosyntactic properties are
‘marked’ in a lexical element. He sometimes speaks of listing irregular ‘stems’ as distinct subparts of a lexical entry (Anderson 1992: 133–34), but he also speaks of ‘morphosyntactic features (minimally, perhaps, indications of word class) that restrict the range of Morphosyntactic Representations in syntactic structure that they can interpret.’ If this refers to some kind of arbitrary ‘rule feature’ that triggers a particular morphological rule (cf. Lakoff 1970), as suggested by his reference to word class specifications, this kind of stipulation is of course the worst possible kind of non-explanatory analysis. Because of the inclusion of internal-structure analyses, Anderson’s morphological system has been made significantly more powerful, and the claims about the nature of human language correspondingly weakened:
The point of these remarks is not at all to present a complete theory of Word Formation Rules applying to compounds, or of analogical rules of compound formation, since no such theory has been developed in any comprehensive form at present. Rather, we mean simply to suggest that word-based morphology can survive a variety of problematic examples that might seem to motivate the assumption that word-formation in general creates structure.... (Anderson 1992: 297–98)
But in fact Anderson’s original analysis has not survived. By allowing the lexicon to contain stems rather than words, and to assign internal structure to compounds, he gives away the store. The morphological analysis Anderson arrives at after this bout with compounding is not significantly less powerful than the old IA/IP theories. By contrast, in finding ways to account for Anderson’s problematic examples without positing abstract stems or internal word structure, lexicase dons a white hat and rides to the rescue of a pure and constrained word and paradigm analysis in distress.
Notes
1. I would like to thank Laurie Reid for comments on an earlier version of this paper.
2. See Matthews 1974 and Robins 1959 for comparisons of these three approaches to word structure.
3. ‘WP can be formalized just as fully as IA or IP’ (Robins 1959: 19). ‘WP when reworked in terms of current formal criteria deserves proper consideration as a means of stating and analyzing grammatical systems’ (Robins 1959: 144).
4. As more data are adduced, the rule will evolve into something like the following form:
[+past]          [-past]
[X -cor t]   :   [X -cor ]
   -voi             -voi
5. Anderson is not to blame for this error. Rather, it is the result of deliberate terminological obfuscation by Y.R. Chao:
Instead of requiring the constituents of a [Chinese] compound to be free words, we shall take compounds in a wide sense so as to include those in which the parts are bound morphemes other than affixes. (Chao 1968: 359)
The term ‘compound’ as used by Sinologists represents a rather broader concept. Practically any word written with two or more characters is a compound in this sense. (Chao 1968: 359)
For recent examples of this conventional but incorrect compounding analysis of Chinese polysyllabic words, see the papers in Packard 1997 (other than Starosta et al.) which have the word ‘compound’ in their titles.
6. The term ‘compound’ in a seamless analysis should be understood to refer to a derived word in which the derivational desinence is recognizably related to a full word. When this identification does not correspond to a valid historical etymology, it is a FOLK ETYMOLOGY, but still counts as a compound in the seamless sense.
7. Contrast Anderson’s 1985 characterization of compounding rules: ‘The formal structure [of compounds] can be characterized in terms of (a) the elements which are compounded; (b) the manner in which they are joined; and (c) the category of the resulting compound’ (Anderson 1985: 46).
8. Cf. O’Grady’s ‘Continuity Constraint’: ‘An idiom’s component parts must form a chain.’ (O’Grady 1998: 284); ‘... the proposed analysis reduces idioms to a continuous chain of head-to-head relationships. In the prototypical case, the terms in head-to-head relationships are specific words’ (O’Grady 1998: 285).
9. ‘Palatability’ has no status in the comparison of scientific theories, except perhaps as a rough heuristic. An analysis made available by a constrained theory may sometimes go against approaches we are accustomed to, and that is as it should be: scientific investigation should not be led by scientifically unmotivated traditions and folk beliefs.
10. Words such as woodchuck, chipmunk, mushroom, and somersault are etymologically just folk-analogized versions of words borrowed from other languages (woodchuck from Algonquian (as Cree otchek, Chippewa otchig), chipmunk perhaps from Ojibwa ajidamoon? ‘red squirrel’, mushroom ultimately from Late Latin mussirio, somersault/summersault from Latin supra ‘over’ + saltus ‘a leap’). They are regarded as compounds presumably because their parts have a superficial resemblance to free words, which is exactly why other better pedigreed words are so regarded, e.g., doghouse, water closet.
References
Anderson, Stephen R. 1985. ‘Typological distinctions in word-formation’. In Language Typology and Syntactic Description III: Grammatical Categories and the Lexicon, ed. by Timothy Shopen, pp. 3–56. Cambridge: Cambridge University Press.
Anderson, Stephen R. 1992. A-morphous Morphology. Cambridge: Cambridge University Press.
Baker, Mark C. 1987. Incorporation: A Theory of Grammatical Function Changing. Chicago: The University of Chicago Press.
Bralich, Philip. 1991. X-bar Theory and Morphological Juncture. Ph.D. dissertation. Honolulu: University of Hawai’i.
Chao, Yuen Ren. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.
Ford, Alan, Rajendra Singh and Gita Martohardjono. 1997. Pace Pāṇini: Towards a Word-based Theory of Morphology. (American University Studies Series XIII). Linguistics, 34. New York: Peter Lang.
Hendrick, Randall. 1995. ‘Morphosyntax’. In Government and Binding Theory and the Minimalist Program, ed. by Gert Webelhuth, pp. 297–349. Oxford: Blackwell.
Hockett, Charles F. 1954. ‘Two models of grammatical description’. Word. 10: 210–34.
Kornai, Andras and Geoffrey Pullum. 1990. ‘The X-bar theory of phrase structure’. Language. 66(1): 24–50.
Lakoff, George. 1970. Irregularity in Syntax. New York: Holt Rinehart and Winston.
Lieber, Rochelle A. 1981. On the Organization of the Lexicon. Bloomington, Indiana: Indiana University Linguistics Club.
Matthews, P.H. 1974. Morphology: An Introduction to the Theory of Word Structure. Cambridge: Cambridge University Press.
McCarthy, John J. 1981. ‘A prosodic theory of non-concatenative morphology’. Linguistic Inquiry. 12: 373–418.
Ng, Siew Ai and Stanley Starosta. 1996. ‘Compounding confusion in Chinese word formation’. In Proceedings of the 7th North American Conference on Chinese Linguistics and 5th International Conference on Chinese Linguistics Proceedings (NACCL and ICCL), ed. by Tsai-Fa Cheng, Yafei Li and Hongming Zhang. Vol. II: Discourse, Historical Linguistics, Morphology, Phonology and Phonetics. Los Angeles: GSIL Publications, Department of Linguistics, University of Southern California. pp. 307–24.
O’Grady, William. 1998. ‘The syntax of idioms’. Natural Language and Linguistic Theory. 16: 279–312.
Packard, Jerome L. (ed.). 1997. ‘New approaches to Chinese word formation: Morphology, phonology and the lexicon in Modern and Ancient Chinese’. In Trends in Linguistics. Studies and Monographs. Berlin: Mouton de Gruyter.
Pullum, Geoffrey K. 1985. ‘Assuming some version of X-bar theory’. Proceedings of the Chicago Linguistic Society. 14: 323–53.
Robins, R.H. 1959. ‘In defence of WP’. Transactions of the Philological Society: 116–44. Reprinted in R.H. Robins, Diversions of Bloomsbury: Selected Writings on Linguistics. Amsterdam: North-Holland.
Selkirk, Elizabeth. 1982. The Syntax of Words. Linguistic Inquiry Monograph Seven. Cambridge, Mass.: MIT Press.
Starosta, Stanley. 1971. ‘Derivation and case in Sora verbs’. Indian Linguistics. 32(3): 194–206.
Starosta, Stanley. 1988. The Case for Lexicase: An outline of Lexicase Grammatical theory. Open Linguistics Series. London: Pinter Publishers.
Starosta, Stanley. 1989–90. ‘Sora combining forms and pseudo-compounding’. In Proceedings of the Symposium on Austro-Asiatic Languages, Mon-Khmer Studies XVIII-IX, ed. by David Thomas, pp. 77–105.
Starosta, Stanley. 1994. ‘Idioms in lexicase: A first approximation’. Class handout c.970, Linguistics 422. Honolulu: Department of Linguistics, University of Hawai’i.
Starosta, Stanley. 1998. ‘A seamless analysis of Micronesian noun incorporation’. Plenary presentation, fifth annual meeting, Austronesian Formal Linguistics Association. Honolulu: Department of Linguistics, University of Hawai’i.
Starosta, Stanley, Koenraad Kuiper, Zhi-qian Wu, and Siew Ai Ng. 1998. ‘On defining the Chinese compound word: Headedness in Chinese compounding and Chinese VR compounds’. In New Approaches to Chinese Word Formation: Morphology, Phonology and the Lexicon in Modern and Ancient Chinese, ed. by Jerome L. Packard, pp. 347–70. Trends in Linguistics. Studies and Monographs. Berlin: Mouton de Gruyter.
Starosta, Stanley and Siew Ai Ng. 1995. ‘Compounding confusion in Chinese word formation’. Joint meeting, 4th Annual International Conference on Chinese Linguistics (ICCL-4) and 7th North American Conference on Chinese Linguistics (NACCL-7), 27–30 June 1995. Madison, Wisconsin: The University of Wisconsin-Madison.
Williams, Edwin. 1981. ‘On the notions “lexically related” and “head of a word”’. Linguistic Inquiry. 12: 245–74.
Zvaig, Nava. 1991. ‘Lexicase Formalization of Hebrew Inflections’. Term paper, Linguistics 640. Honolulu: University of Hawai’i.
7 Micronesian Noun Incorporation: A Seamless Analysis1
Stanley Starosta
Introduction: Sora Noun Incorporation
It is a well-known fact that verbs like nouns. They are said to ‘take’ complements, and sometimes they seem to ‘take’ the complement nouns right inside their own bodies, literally incorporating them. The result is a verb that is still a single word, and still a verb, but which is longer and has more to tell us than its non-incorporating counterpart. Figure 7.1 provides a few examples from Sora, a Munda language spoken in eastern India (Starosta 1967), to illustrate the phenomenon. Note that for each ‘incorporating’ verb, there is a shorter verb that bears a partial semantic and phonological similarity to it, and a noun that is related to it semantically and usually phonologically as well.
Noun Incorporation: General Syntactic Analyses
‘Noun Incorporation’ constructions turn up in many languages in many language families. In American linguistics, including Chomskyan and typologically oriented varieties, there are two fairly clearly distinguishable positions on noun incorporation (‘NI’) to be seen, a syntactic (transformational) approach and a lexical one. As an example, in a transformational analysis, the Sora verb gg bb a ‘Make him/her sit on a buffalo!’ would be derived from a full sentence such as anin ad b tel n ak ndu le n gg ba ‘Make him sit on the buffalo’s back’.
Incorporating verb | Gloss | Unincorporated words | Gloss
gg bb a | ‘Make him/her sit on a buffalo!’ | g ba; b tel n | ‘Sit (on something)!’; ‘buffalo’
gg b a | ‘Initiate him (by seating him on a buffalo)!’ | g ba; b tel n | ‘Sit (on something)!’; ‘buffalo’
gusi te | ‘builds a house’ | gua; su u n | ‘Plant (it)!’, ‘Erect (it)!’; ‘house’
gu rnete | ‘there is a funeral ceremony’ | gua; re n | ‘Plant (it)!’, ‘Erect it!’; ‘stone’
amkidte | ‘be taken by a tiger’ | ama; kinan | ‘Take (something)!’; ‘tiger’
amkidten | ‘takes a tiger’ | ama; kinan | ‘Seize (something)!’; ‘tiger’
gars n ne | ‘to ask us for anything’ (rude) | gara; s | ‘Ask (someone for something)!’; ‘feces’
a bo l i | ‘they got married’ | a a; nsel n | ‘Get (something)!’; ‘woman’
Figure 7.1: Sora Noun Incorporation
The most prominent and enthusiastic proponents of the idea that noun incorporation must be accounted for in terms of syntactic rules that create lexical structures are probably Jerrold Sadock (the ‘autolexical approach’, e.g., Sadock 1986) and Mark Baker (the incorporation analysis, Baker 1988).
It has been suggested, most notably by Baker (1988a: Chapter 3) and by Sadock 1980, 1985a, 1986, that noun incorporation is a syntactic rule that realizes the head of the direct object noun phrase or the head of the subject of an unaccusative verb within the verbal complex, either by movement (Baker) or by co-analysis (Sadock). The result is a morphologically complex verb, containing a noun root that is linked to the direct object position (either by a trace relation or by the disjunction of the syntactic and morphological analyses). (Rosen 1989: 295)
This approach is also the one adopted in Hendrick’s chapter on morphology in Webelhuth’s book on the Minimalist Programme: ‘Let us, along with Baker (1988a) and Sadock (1991), take noun
incorporation to be a phenomenon where a noun enters into a compound with another formative and yet remains syntactically active’ (Hendrick 1995: 332). Transformational analyses of word-formation always fail. Their remaining proponents haven’t yet realized this, even after decades of conclusive counter-evidence, because powerful frameworks like Chomskyan transformational grammar and Jerrold Sadock’s Autolexical Syntax are not generative; that is, they are never precisely formalized, so the fatal flaws never rise to the surface to confront their perpetrators. Some of the reasons why transformations won’t work are illustrated in the Sora examples above. They are: (1) excessive metatheoretical power, (2) variable productivity, (3) semantic unpredictability, and (4) phonological unpredictability.
(1) Excessive metatheoretical power
A grammatical framework that allows transformations to construct lexical items is a very powerful one. Thus, if one were to claim that gg bb a should rather be derived from anin ad b tel n asa kale n gg ba ‘Make him sit on a buffalo’s neck’ or anin ad b tel n asambile n gg ba ‘Make him sit on a buffalo’s buttocks’ or anin ad b tel n amadule n gg ba ‘Make him sit on a buffalo’s corpse’, or maybe even anin ad b tel n a[e]Nle n gg ba ‘Make him sit on a buffalo’s [e]N’, there is no linguistic basis for preferring one analysis over the other. It is hard to imagine how such an approach could be falsified, and a ‘theory’ that can’t be falsified even in principle has no empirical content. It is scientifically uninteresting.
(2) Variable productivity
A transformation should apply to every structure that meets its structural description, and the result should be completely predictable from the input structure. However, not every transitive verb combines with every noun to form an actual or possible verb. Thus while gars d a ‘Don’t request in an irritating way!’ is a recognized word, I think it would be found that gars dd a ‘Don’t request a dog’ is not a recognized word, though it is a possible one.
(3) Semantic unpredictability
A similar problem arises with meaning in transformational analyses of incorporation. Incorporated forms are regarded as ‘compounds’, and compounds are prototypically ‘non-compositional’.
Incorporating verb | Gloss | ‘Incorporated’ substrings | Gloss
gusi te | ‘builds a house’ | gu..; ..si .. | ‘plant’, ‘erect’; ‘house’
gu rt i | ‘they have a funeral ceremony’ | .. r.. | ‘stone’
amkidten | ‘is taken by a tiger’ | am..; ..kid.. | ‘seize’; ‘tiger’
amkidte | ‘takes a tiger’ | |
gars te | ‘asks for’ | gar..; ..s .. | ‘request’; ‘feces’
a bojl i | ‘they got married’ | a ..; ..bo .. | ‘acquire’; ‘woman’
gg bb te | ‘causes to sit on a buffalo’ | ..g b.., ..g ..; ..b .., .. .. | ‘sit’; ‘buffalo’
gg b te | ‘initiates’ | |
Figure 7.2: Free Forms and ‘Combining Forms’ of Sora Nouns and Verbs
One of the interesting things about a compound is that you cannot always tell by the words it contains what the compound means. The meaning of a compound is not always the sum of the meanings of its parts. A blackboard may be green or white. (Fromkin and Rodman 1988: 137)
This applies to noun incorporation as well. The meaning of the ‘incorporating’ form may deviate from the composite meaning of its parts, and the meanings of two words may differ unpredictably even when the supposed component words are the same ones. To summarize, the properties of ‘incorporation’ that make this phenomenon non-amenable to a transformational analysis then include (1) excessive metatheoretical power, (2) variable productivity, (3) semantic unpredictability and (4) phonological unpredictability. A first-year linguistics student will recognize at least (2)–(4) as the characteristics of lexical derivation, and a second-year student should have learned in addition about the dangers of excessive expressive power.
(4) Phonological unpredictability
Transformations can’t do morphology except for the very simplest cases, because the output of a transformation is supposed to be completely predictable from the input. However, if we compare a noun with its word-internal counterpart in the Sora examples, we
find that there is no generalization about how to derive one from the other.2 A serious (generative) transformational solution would have to write a separate rule for each pair. Baker at least recognizes that a transformational analysis of morphology has promises to keep:
The view which I will adopt is similar to that of Sadock (1985) and especially Marantz: I claim that morphology is in effect another subtheory, roughly on a par with the established subtheories of principles of government-binding theory enumerated in 2.1.3. As such, ‘Morphology Theory’ (as we may call it) can be characterized as the theory of what happens when a complex structure of the form [Z⁰ X + Y] is created .... Morphology theory’s responsibility is twofold: first, it determines whether a structure dominated by an X⁰ level category is grammatical or not in a given language; second, if the structure is well-formed it assigns it a phonological shape [italics mine]. (Baker 1988: 68–69)
However, as far as I can determine from his book, Baker has no clue about the kinds of problems of principle and practice that he would face if he were ever to try to make good on this promise.
Word-based Analyses vs Morpheme-based IA/IP Analyses
The indications are clear then that ‘incorporation’ is not a syntactic phenomenon, but can only be a lexical one. That still leaves an important choice to be made, however: ‘To put the question in a current and popular formulation, the issue at stake is whether the word is the syntactic “atom” or whether a smaller unit, the morpheme, is’ (Hendrick 1995: 300).
Unincorporated noun | ‘Incorporated’ noun | Gloss
b tel n | ..b .. | ‘buffalo’
b tel n | .. 3.. | ‘buffalo’
su u n | ..si .. | ‘house’
re n | .. r.. | ‘stone’
kinan | ..kid.. | ‘tiger’
s | ..s .. | ‘feces’
nsel n | ..bo .. | ‘woman’
Figure 7.3: Full Forms and ‘Combining Forms’ of Sora Nouns
In the context of this paper, that means, (1) are we going to divide words into meaningful bits of pronunciation by vertical and/or horizontal slicing, using classical structuralist IA (item-and-arrangement) or IP (item-and-process) procedures or their modern avatars or their Pāṇinian antecedents? or (2) are we going to account for shape-meaning correspondences between sets of words in the lexicon non-surgically, using analogical formulae rather than segmentation and classification? Option (1) assumes that a word is composed of separate MORPHEMES, bounded bits of sound-plus-meaning jammed together like pieces of a jigsaw puzzle,4 while option (2) does not. Instead, it treats a word form as a seamless unit delimited by word boundaries and composed of hierarchically organized phonological segments, but containing no internal grammatically determined boundaries. Option (2) is embraced nominally by the ‘a-morphous’ morphologists, but they have found compounds the biggest obstacle to full implementation of the plan. In fact, it almost appears as if the would-be constrainers have given up without a fight. Stephen Anderson’s position on compounds for example amounts to a statement of capitulation and an unconvincing attempt to salvage a few fragments of the original brave banner of ‘a-morphousity’ from the ashes of defeat:
So does all this mean that we have abandoned the claim of Chapter 10 that words do not have internal non-phonetic (or non-phonological) structure? Yes and no. We do, apparently, have to recognize the possibility of such structure, but this does not entail the further claim that morphological rules in general should create it. Word Formation Rules do not build structure, that is, unless explicitly stipulated to do so. And the motivation for word-internal non-phonological structure is not the mere fact that some Word Formation Rule has applied in creating the word, but rather the fact that the structure in question is referred to by a rule of the morphology (or of the syntax, if feature percolation within compounds is properly described in that part of the grammar). (Anderson 1992: 298)
Clearly, if a true word-based seamless morphology is the goal, the question of the supposed internal structure of compounds must be resolved. This has been attempted in Singh and Dasgupta (1999),
and I hope to contribute a more detailed syntactic dimension to the solution of the compounding problem in general in Starosta (in this volume) and specifically with respect to noun incorporation in the present paper.
Lexical Analyses
Compounding: Alternative (1), the morpheme-based approach to compounding, has a long tradition, beginning with Sapir and Bloomfield and continuing on into modern Chomskyan and theoretically agnostic typological approaches.
A number of researchers defend the point of view that noun incorporation is non-syntactic. Di Sciullo and Williams (1987), Mithun (1984, 1986) and Rosen (1989) all view noun incorporation as a lexical process. Rosen’s analysis is perhaps the most carefully worked out alternative to a syntactic treatment of noun incorporation. (Hendrick 1995: 333)
In these morpheme-based lexical approaches to Noun Incorporation, the phenomenon is always regarded as a kind of compounding:
A special case of compound formation is incorporation (usually by a verb, of its object or intransitive subject). (Anderson 1985: 6)
An alternative to the syntactic approach to NI is to posit that the complex verb is derived lexically, by a word-formation process similar to compounding. Mithun (1984, 1986a) and Di Sciullo & Williams (1987: 64–68) suggest that NI is a lexical process. (Rosen 1989: 295)
Almost all the researchers assuming a lexical analysis of NI verbs regard the lexical word-formation process as one that takes two free words as input (see Ng and Starosta 1996 for a list of citations to this effect from the literature):
Noun Incorporation (NI) is a process whereby nouns combine with verbs to produce a complex verb, as in the Onondaga sentence in 1b and the Niuean sentence in 2b. Sentences like 1b and 2b have nonincorporated counterparts, as illustrated in 1a and 2a. (Rosen 1989: 294)
Problems for a Compounding Analysis
(a) Components bound; shape unpredictable
This raises a serious conceptual problem, though. In cases of NI, either the verbal or the nominal part may turn out to not be a free word. In such cases, the bound parts are called STEMS: ‘In N[oun] Incorporation, as commonly understood since Sapir (1911), a noun stem is compounded with a verb stem to yield a more specific, derived verb stem’ (Mithun 1986: 32). If ‘stems’ don’t appear as free words, however, they are not words by the standard Bloomfieldian definition (a word is a minimal free form). Syntactic and lexical analyses of these forms as ‘compounds’ in the classical sense then are wrong.5 An alternative definition of compounds in terms of ‘stems’ rather than ‘free forms’ in effect loses the distinction between compounds and other kinds of complex words and invalidates the analyses which continue to maintain this distinction.
The Seamless Approach: From the seamless point of view, the compounding analysis of Noun Incorporation is simply incoherent. The ‘incorporated noun’ and/or the ‘verb stem’ is a substring of phonemes that may mark the ‘incorporating’ verb as etymologically related to a free noun and/or a free verb. Since they are bound rather than free, they have no syntactic distribution and thus have no grammatical class. The reason why these substrings often differ in shape and meaning from full nouns, why they have no syntactic dependents, why they lack the inflectional properties of nouns and verbs, and why the ‘nouns’ don’t have discourse referents,6 is that they are not nouns or verbs at all, and thus are not characterized by the lists of grammatical properties that characterize forms belonging to the class of nouns or verbs. I will try to show below that data from NI in Micronesian languages are not only consistent with this analysis, but lend fairly strong support to it.
The ‘seamless morphology’ analysis developed independently in Montréal (Ford and Singh and their students and colleagues) and Hawai’i (Starosta and his students and colleagues) selects option (2). It attempts to be a true word-based approach to word-formation, in which perceived relationships among words are accounted for by analogical rules called Word Formation Strategies (WFSs). In a lexicase analysis, a WFS has the following shape:
(1)  [aFi, bFj]  :  [aFi, gFk]
     [..a]       :  [..b]
A WFS is an analogical formula describing the similarities and differences between pairs of words. Here [aFi] represents the semantic and distributional features shared by the pairs of words, and [bFj] and [gFk] to the features that differ between them. Similarly, [..] refers to whatever elements of phonological shape are shared by the pairs of words, and [a] and [b] to whichever segmental and/or nonsegmental phonological properties are unique to the items on the left of the hyphen or to the items to the right. Note that WFSs do not create or refer to word-internal boundaries or hierarchical structure, though they may refer to syllable structure.
Unlike transformations, such derivation rules are typically not productive. They do not generate a set of words, but rather state an analogical relation between pairs of words which already exist in the lexicon. They facilitate the addition of new members as needed and the recognition of members of the set not previously encountered. From an IA point of view, ‘... the words as well as the morphemes must be listed in our dictionaries. The morphological rules also are in the grammar, revealing the relations between words and providing the means for forming new words’ (Fromkin and Rodman 1988: 138). From the seamless point of view, what is listed is not words, morphemes and rules, but rather just words and WFSs extracted from the words.
(a) Constraints
The constraints imposed on WFSs in both the Montréal and lexicase approaches include the following:
(i) Words have no internal non-phonological segmental or hierarchical structure, and no internal boundaries.
(ii) Syntactic rules do not manipulate words internally in any way.7
These restrictions may seem to be negative but we hold the view that such is the nature of any constraint. This should not come as a surprise to anyone. The only way to demonstrate that we do not need the excluded notions is by induction based on the
analyses presented here. We claim that there is no need for the outlawed restrictions. The burden of proof is, clearly, on those who want to dispense with these constraints and introduce additional devices to account for the facts. (Ford, Singh and Martohardjono 1997: 3–4)
Non-compositional morphology poses a bold and serious threat to the central dogma in morphological theory. Being the more restrictive hypothesis, in the absence of evidence against it, non-compositional morphology is the one that we must adopt. As far as I know, the challenge posed by this theory has not been addressed satisfactorily in the literature. (Mohanan 1996: 141)
In the lexicase implementation of this theory, the syntactic component is comparably constrained, limiting grammatical representations to a single level of analysis composed of integral and pronounceable words which are linked to each other grammatically by unidirectional dependency relations and which appear in their ‘surface’ order. There are no empty categories and no ‘functional projections’ of empty categories. All grammatical relations are local to begin with, so no movement is required for ‘checking’. In spite of these constraints, the theory has been able to account formally, explicitly, and economically for a wide range of constructions. This paper takes on one more, ‘noun incorporation’ in Micronesian languages. To paraphrase Mohanan, if the attempt succeeds at least as well as more powerful morpheme-based alternatives, the seamless analysis is preferable, and the burden of proof falls upon those who would claim otherwise.
(b) The seamless analysis of incorporation
The focus of attention in this paper is the ‘incorporated’ forms. How should they be described in an explicit grammar? For reasons stated above, we can dismiss a purely syntactic description in which these forms are derived by transformational rules from sequences of a transitive verb followed by an Accusative NP. The analysis can only be lexical.
It is moreover the contention of this paper that the analysis must be seamless, that is, noun incorporation constructions in Micronesian languages can and should be analyzed as having no internal non-phonological structure, no hierarchically related word-internal subchunks, with no internal boundaries separating such subchunks. In the remainder of this paper I will review the NI patterns
in six Micronesian languages, Kosraean, Marshallese, Mokilese, Pohnpeian, Woleaian, and Yapese. Using examples from these languages, I will summarize the evidence in support of a seamless analysis and present some explicit WFSs to illustrate how the observations can be accounted for in a generative8 lexicase dependency analysis.
(c) Verb classification
In the discussion of the Micronesian data, it will be necessary to introduce the relevant parts of the lexicase syntactic machinery, in particular, the system of verb classification. In dependency grammar, verbs are syntactically subcategorized according to their valence, that is, according to the kinds of dependents they allow. In lexicase dependency grammar, the most salient dimension of verb subcategorization is classification according to the CASE RELATIONS9 borne by noun-headed dependents (ACTANTS). Every verb requires one Patient (PAT) and one actor (actr) dependent, and every transitive verb has an Agent (AGT) dependent. Otherwise, a noun-headed dependent may be a complement or adjunct Locus (LOC), Correspondent (COR), or Means (MNS). Subcategorization in terms of case relations distinguishes 16 primary verb types (as shown in Figure 7.4). The classes are labelled by the binary features ±trns, ±lctv, ±crsp, and ±mode, where +trns ‘transitive’ implies [?[AGT]], +lctv ‘locative’ implies [?[LOC]], +crsp ‘correspondent’ implies [?[COR]], +mode implies [?[MNS]], and [V] implies [?[PAT]] and [?[actr]].
It is commonplace in modern and traditional linguistics, following folk perception and conventional lexicographic practice, to think of two linguistic forms with the same pronunciation as a single word which may have multiple glosses and distributions.
In a lexicase analysis, by contrast, words are distinct if they differ in pronunciation, meaning, or syntactic distribution,10 so that for example, English will have separate lexical entries for transitive drink (primary class 9) and intransitive drink (primary class 1).
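The way four binary features yield 16 primary classes can be illustrated with a minimal sketch. The function names and the numbering scheme below are assumptions made for illustration: the numbering treats trns, lctv, crsp and mode as binary digits in that order, which is how the leaves of Figure 7.4 appear to be arranged and which agrees with the class numbers actually used in this paper (1, 2, 3 and 9). It is not offered as a definition taken from the lexicase literature.

```python
# Hypothetical encoding of the four binary features. The ordering (trns,
# lctv, crsp, mode) and the resulting numbering are assumptions read off
# Figure 7.4; only classes 1, 2, 3 and 9 are confirmed by the text.
FEATURES = ("trns", "lctv", "crsp", "mode")
IMPLIED_DEPENDENT = {"trns": "AGT", "lctv": "LOC", "crsp": "COR", "mode": "MNS"}


def primary_class(**plus):
    """Return the class number (1-16) for the features marked '+'."""
    unknown = set(plus) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    number = 0
    for name in FEATURES:
        # Each feature contributes one binary digit, trns being the highest.
        number = number * 2 + int(bool(plus.get(name, False)))
    return number + 1


def expected_dependents(**plus):
    """Every verb expects PAT and actr; each '+' feature adds one dependent."""
    dependents = ["PAT", "actr"]
    dependents += [IMPLIED_DEPENDENT[f] for f in FEATURES if plus.get(f, False)]
    return dependents


# Two homophonous entries for English 'drink', kept apart by their class.
drink_intransitive = ("drink", primary_class())
drink_transitive = ("drink", primary_class(trns=True))
```

Under these assumptions `drink_intransitive` falls in class 1 and `drink_transitive` in class 9, so the two entries are distinct even though their pronunciation is the same, and `expected_dependents(trns=True)` lists PAT, actr and AGT.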
Figure 7.4: Lexicase Verb Subcategorization by Case Relations 11
[Tree diagram: the root V (?[PAT], ?[actr]) branches successively by ±trns (+trns: ?[AGT]), ±lctv (+lctv: ?[LOC]), ±crsp (+crsp: ?[COR]) and ±mode (+mode: ?[MNS]), yielding the 16 primary verb classes, numbered 1 to 16.]
Micronesian ‘Noun Incorporation’
Previous Analyses
At this point it is time to get into some data. This is a bit frustrating, since much of the work on Micronesian languages in general, and
on Micronesian noun incorporation in particular, is vague and imprecise, and uninformed by advances in modern linguistics. There is much valuable data available and occasionally insightful observations, but the presentation and analysis is generally odd and idiosyncratic, so that it is sometimes difficult to find information on a particular topic. This appears to be not so much a matter of ignorance of linguistics, but rather of a strange but pervasive idea that there is some kind of contradiction between a theoretically informed linguistic analysis and a comprehensive and comprehensible one:
In preparing this reference grammar, I have not followed any particular linguistic model of description but have tried to make the book as comprehensive and at the same time as detailed as possible, hoping that the result will be of some help to the Kusaiean people and to linguists who are interested in the Kusaiean language. (Lee 1975: xv)
My purpose in writing this book has been to provide a description of the major grammatical features of Ponapean for the reader who has had little or no training in the analysis of language. Although this work is intended primarily for native speakers of Ponapean who are bilingual in English, I hope it will also be useful to others whose interests have brought them to the study of this language. .... The organization of this grammar is ultimately based upon the practical problem of providing a relatively nontechnical description of a language, rather than upon a particular theory of the organization of language itself. Consequently, I have drawn upon a variety of grammatical traditions in discussing this language .... (Rehg 1981: xiii)
A major aim of this book is to provide as much of an overall picture of the sound system and grammatical structure of contemporary Woleaian as possible, using both traditional and current methods of linguistic description. (Sohn 1975: xi)
However, I felt that it would be possible for the interested linguist to use the book in its present form, basically descriptive (rather than theoretical), but if it had been written in the full panoply of linguistic terminology it would have been inaccessible to most Yapese readers. (Jensen 1977: xix)
One exception to this pattern is Louise Pagotto’s dissertation on Marshallese (Pagotto 1987), which is formulated in the rigorous lexicase dependency framework.
Micronesian Languages and Transitivity
1. The Micronesian transitivity ‘continuum’
Micronesian languages (a subgroup of the Oceanic subfamily of the Austronesian language family, spoken on islands in the Pacific Ocean between Polynesia and the Philippines and Taiwan) are a nice testing ground for the claim that a seamless analysis of noun incorporation is preferable to an analysis formulated in terms of segmentation. These languages exhibit a range of verbal constructions which could be thought of pre-theoretically as a continuum between intransitive and transitive (cf. Sugita 1973: 404).
At one end are the pukka transitives. They allow a Nominative and an Accusative argument, the ‘subject’ and ‘direct object’, and these noun phrases may contain a full range of modifiers, relative clauses, etc. The Acc phrase depending on a transitive verb is definite (or quantified) and exhibits other characteristic semantic properties identified by Hopper and Thompson (Hopper and Thompson 1980) as markers of ‘semantic transitivity’. Morphologically these verbs end in a substring associated with syntactically and semantically transitive verbs, and the remaining substrings are similar but frequently not identical to the other related forms in the continuum.
The other verbs in the continuum differ syntactically, semantically, and morphologically from the true transitives. Morphologically, they differ from true transitives in lacking the transitive ending and in showing reduplication or some other phonological shape difference from the related transitives. Syntactically, they take a nominative complement and may have an additional locative or ablative complement. In Kosraean, the nominalizations of transitive verbs end in ..iyac] while nominalizations of intransitive verbs are identical in form to their corresponding verbs. True transitive verbs can be passivized with ..yuhk], but intransitives, including two-argument intransitives, cannot (Lee 1975: 273).
These additional dependents of intransitives if present are associated with a lower degree of semantic transitivity relative to grammatically transitive two-argument clauses. The ablative NPs in
Marshallese two-argument intransitives are indefinite, and the locatives (in Marshallese only) are semantically oblique in other ways.
Hiroshi Sugita describes the continuum in terms of ‘transitive’, ‘semi-transitive’, and ‘intransitive’ verbs, with NI forms counting as intransitive. He notes that not all the segments of the continuum are present in every Micronesian language:
We have found in the discussions above that the phenomenon of ‘object incorporation’ is found only in Kusaiean and Ponapean and not in Trukese and Marshallese. On the other hand, we have found that definite noun phrases are allowed to be objects of ‘semitransitive’ verbs only in Trukese and Marshallese and not in Kusaiean and Ponapean. (Sugita 1973: 404)
Among the syntactically monovalent verbs at the extreme intransitive end of the spectrum are a subset which look as if they were formed from a combination of an intransitive ‘stem’ plus a single noun, though both the ‘verb’ and ‘noun’ may differ in shape from their corresponding uncombined words. Such complex forms are often difficult to distinguish from a sequence of an intransitive verb followed by an ablative dependent, but typically there are morphological and syntactic clues that help us to identify the end of a verb form and decide whether the ‘noun’ is inside of the verb or outside of it.
2. A generative analysis of the transitivity continuum
Each check mark in Figure 7.5 represents a different verb class, so if a given ‘word’/‘stem’/‘root’ occurs in each of five classes, say, there are five separate though possibly homophonous lexical entries. This analysis may be contrasted with the theoretically unfounded ‘polysemy’ approaches to these languages, where two homophonous words are analyzed as a single word with two distributions:
A construction where a neutral verb occurs is called a neutral construction. As we have seen, a neutral verb is different from an intransitive verb in that it may take an object. This object may be omitted when it is understood. (Woleaian, Sohn 1975: 246)
(2) I be iul shal. ‘I will drink water.’
(3) I be iul. ‘I will drink (something).’
In Sohn’s analysis, thus, there is a single verb iul in these two examples. In a lexicase analysis, by contrast, verbs which occur in different syntactic environments belong to different syntactic classes and so are separate lexical entries. In these two examples, there would thus have to be two verbs, an intransitive verb iul ‘drink’ in (3) and a second verb in (2) which is either a two-argument verb iul ‘drink of something’ or an intransitive NI verb iulshal ‘drink-water’, depending on a detailed analysis of the examples. The regular relationships between any two classes, and between related lexical items in the two classes, are accounted for in terms of WFSs.
Figure 7.5: Some Micronesian Verb Classes
Column features: classes 1 to 3 are V −trns, class 9 is +trns; class 1 is −crsp −mode; class 2 is +mode (MNS −dfnt); class 3 is +crsp (COR +dfnt).
Column labels: one argument ‘intransitive’; two argument semi‘intransitive’, ‘transitive’.
Language                           1 simplex   1 ‘incorporated’   2          3      9
Chamorro, Kosraean, Marshallese    ✓ ✓ ✓       ✓ ✓ 12             ✓? ✓       – ✓    ✓ ✓ ✓
Mokilese, Pohnpeian, Trukese       ✓ ✓ ✓       ✓ ✓ –              ✓? ✓? ✓    – ✓    ✓ ✓ ✓
Woleaian, Yapese                   ✓ ✓         ✓ ✓                ✓ ✓?       ✓ –    ✓ ✓
3. Verb classes and Micronesian transitivity
Bearing in mind that homophony does not imply lexical identity in a strict formal analysis, the Micronesian ‘transitivity continuum’ can be analyzed in the lexicase framework as composed of verbs from lexicase classes 1, 2, 3 and 9, where class 1 includes ‘incorporated’ as well as simplex forms. The check marks in Figure 7.5 indicate
which of the languages considered in this paper have which of the verb types. Of these languages, only one, Marshallese, has been described within a rigorous and theoretically founded framework of analysis, so the relevant information for the other languages is sometimes difficult to extract from the descriptions. Unclear points are indicated by question marks.
Noun Incorporation in Micronesian Languages
Most of the properties that distinguish NI verbs in Micronesian languages from verbs with external N dependents have been neatly summarized by Hiroshi Sugita:
‘Incorporated objects’ in Kusaiean and Ponapean have the following characteristics: (1) They cannot be modified or quantified; (2) they cannot be moved out of their positions; (3) nothing can be inserted between them and preceding verbs; and (4) verbal suffixes follow them rather than verbs. All these characteristics suggest that they are bound to verbs and that they do not constitute any independent noun phrases as their ‘objects’. I believe that we should consider a combination of an ‘intransitive verb’ and its ‘incorporated object’ in these languages a special kind of compound verb. (cf. Lee 1975: 270–71; Sugita 1973: 404)
Other identifying criteria turn up in individual languages. For example, in Yapese grammatically transitive verbs take one class of ‘subject markers’, while all other verbs, including two-argument intransitives and single-argument NI and simplex verbs, take another set (Jensen 1977: 125, 217). In Kosraean, instrumental focus transitive verbs ending in ..kihn] correspond to intransitive verbs, including NI intransitives, but not to transitives (Lee 1975: 273–74).
So, NI verbs are single-argument intransitives. What does that tell us about two-argument verbs, then? One swallow does not make a summer, and two arguments do not make a transitive verb. When a language has more than one two-argument construction, it will typically turn out that a consideration of syntactic, morphological, and semantic properties shows one of the constructions to be transitive and one to be a two-argument intransitive construction (see Gibson and Starosta 1990, Ho 1993, Huang 1994, Starosta 1988, 1998). This same phenomenon appears in Micronesian languages,
but it has been characterized in different ways because of the theoretical anarchy endemic in this area. In general, theoretically unchurched Micronesianists call the second argument an ‘object’ and leave it at that. Of the theoretically unfounded analyses, Hiroshi Sugita’s comes closest to Pagotto’s lexicase analysis of Marshallese, the one to be adopted in this paper: ‘One might, then, attempt to distinguish the object of a semitransitive verb from that of a transitive one by assigning the former to a “case” category other than that of “object”’ (Sugita 1973: 405). The following Marshallese examples will illustrate some of the properties just introduced.
Transitive [+trns]
In Marshallese, grammatically transitive verbs have a different shape from other related verbs, and the Acc-PAT ‘direct objects’ are definite or quantified.
(4) (Pagotto 1987: 463, 69c; SS reanalysis)
j n    ear13     ka¤     ek     eo    ko a
John   3s-CMPL   eat     fish   the   catch-1s
AGT              +trns   PAT
actr
‘John ate my fish.’
(5) (Pagotto 1987: 467, 71a; SS reanalysis)
iar       ilim        juon   bato
1s-CMPL   drink       one    bottle
          +trns       PAT
          m[AGT]14
          m[actr]
‘I drank a bottle.’
(6) (Pagotto 1987: 468, 71b; SS reanalysis)
iar       ainbate   juon   böb
1s-CMPL   cook      one    pandanus
          +trns     PAT
          m[AGT]
          m[actr]
‘I cooked a pandanus.’
Intransitive [−trns]
Intransitive verbs may have one argument or two. The two-argument verbs may be +crsp, with a definite COR dependent, or +mode, with an indefinite MNS dependent.
(a) Correspondent [+crsp, ?[COR,+dfnt]]15
(7) (Pagotto 1987: 463, 69b; SS reanalysis)
j n    ear       ö¤ä     ek     eo    koƒa
John   3s-CMPL   eat     fish   the   catch-1s
PAT              −trns
actr             +crsp
‘John ate of my fish.’
(b) Means [+mode, ?[MNS, −dfnt]]
The +mode verbs are semantically like NI verbs in taking indefinite ‘objects’. However, they can sometimes be identified by an intruding question adverb ke (8) or when they follow a verb-final direction-marking substring (9–11) or when the verb and ‘object’ have fused phonologically (12), or when the clause is clefted (13):
(8) (Pagotto 1987: 412–13; SS’s reanalysis)
kwöj      idaak     ke   ban?
2s-PROG   drink     Q    punch
          −trns          MNS
          +mode
          m[PAT]
          m[actr]
‘Are you drinking punch?’
(9) (Pagotto 1987: 414, 33b; SS’s reanalysis)
kwön      aljektok    waini
2s-PTTV   carry-DIR   copra
          −trns       MNS
          +mode
          m[PAT]
          m[actr]
‘Gather copra nuts.’
(10) (Sugita 1973: 403e.)
yehar     m¢egayteq    yek
he-past   eat-hither   fish
‘He came eating fish.’
(11) (Pagotto 1987: 682, 60b; SS reanalysis)
kwaar     wiatok    jimee    jän1   ia?
2s-CMPL   buy-DIR   cement   from   where
          −trns     MNS             LOC
          +mode
          +lctv
          m[PAT]
          m[actr]
‘Where did you buy that cement from?’; cf. [..tok] [V,+drcn]
(12) (Pagotto 1987: 681, 60a; SS reanalysis)
rej        ileek
3pl-PROG   string-fish
‘They are fish-stringing.’; cf. ilele ‘to string’, ek ‘fish’
(13) (Sugita 1973: 403f.)
yek    men     yew   yehar     m¢egay
fish   thing   the   he-past   eat
‘It was fish that he ate.’
(c) Single argument
Single argument verbs may be morphologically unincorporating or of the NI type.
(1) unincorporating
(14) (Pagotto 1987: 397, 19b; SS reanalysis)
l l     in     ej        r l l
earth   this   3s-PROG   spin
PAT                      −trns
‘This earth is spinning.’
(15) (Pagotto 1987: 208, 146a; SS reanalysis)
kwön      oktaktok
2s-PTTV   turn-around-toward-point-of-reference
          −trns
          m[PAT]
‘Turn around and face me.’
(2) NI ‘incorporated’
(16) (Pagotto 1987: 413, 32b; SS reanalysis)
kwöj      idaakban      ke?
2s-PROG   drink-punch   Q
          −trns
          m[PAT]
‘Are you drinking punch?’
(17) (Pagotto 1987: 414, 33a; SS reanalysis)
iar       ö¤äworl k         oom     aal
1s-CMPL   eat-lobster-DIR   until   full
          −trns
          m[PAT]
‘I ate lobsters until I was absolutely full.’
(3) Ambiguous
If no verb-terminating adverb or directional sequence is present, a sentence may be ambiguous between an incorporating analysis and a +mode analysis. This factor presumably enabled the abductive change process that produced ‘noun incorporation’ in the first place.
(18) (Pagotto 1987: 463, 69a; SS reanalysis)
j n    ear       ö¤ä     ek
John   3s-CMPL   eat     fish
PAT              −trns   MNS
                 +mode
‘John ate fish.’
j n    ear       ö¤äek
John   3s-CMPL   eat-fish
PAT              −trns
‘John fish-ate.’
(19) (Pagotto 1987: 418, 36; SS reanalysis)
rej        kattul k        armej    i    ar
3pl-PROG   CAUS-submerge   person   in   lagoon
PAT        −trns           MNS           LOC
           +mode
           +lctv
‘They are dunking people in the lagoon.’
rej        kattul karmej          i    ar
3pl-PROG   CAUS-submerge-person   in   lagoon
PAT        −trns                       LOC
           +lctv
‘They are people-dunking in the lagoon.’
3. Additional manifestations of transitivity
Woleaian two-argument questions are interpreted as perfective with morphologically transitive verbs, but as imperfective with morphologically intransitive verbs, a correlation between morphological and semantic transitivity which Hopper and Thompson (1980) also found in other languages. ‘A question word may be the object of either a transitive or a neutral verb, depending on whether the speaker considers the object specific or not. Thus, the following two constructions are both grammatical with slightly different meanings’ (Sohn 1975: 251).
(20) (Sohn 1975: 251; SS’s analysis)
Ye    foori   metta?
AGT   +trns   PAT
‘He did what?’ (transitive)
(21) (Sohn 1975: 251; SS’s analysis)
Ye    ffoor   metta?
PAT   −trns   MNS
‘He is doing what?’ (neutral)
Sohn’s examples don’t show differences in specificity of the object but rather in aspect. By Hopper and Thompson’s criteria, (20) would be expected to be transitive and (21) intransitive, and this is consistent with the morphological shapes of the verbs.
4. Seamless analysis
The seamless analysis of ‘noun incorporation’ is identical to the analysis of other kinds of lexical derivation: a WFS summarizes an analogical pattern relating two sets of words which are more or less related to each other in shape, distribution, and meaning. An example from Marshallese is what might be called the ‘consumption rule’:
(22) Consumption noun : verb rule
N      :   V
aFi        aFi
           −trns
           +cnsm
           +hbtl
[      :   [ ö¤ä
cnsm = ‘consume’; hbtl = habitual
To paraphrase, for every noun of a particular shape and having the semantic features [aFi], there may be a verb which is identical in shape except for the initial sequence [ ö¤ä.., and which means something like ‘habitually consume aFi’. The sequence [ ö¤ä.. is referred to as the CONSTANT, the part that is present in all members of the set, and the remaining parts of the words in the set are the VARIABLES. Example:
(23) (Pagotto 1987: 681, 33a)
iar       ö¤äworl k         oom     aal
1s-CMPL   eat-lobster-DIR   until   full
‘I ate lobsters until I was absolutely full.’
Here the variable part of the word is double-underlined. As in the case of other derivation processes, there will be a separate WFS for each ‘incorporation’ pattern. Thus another WFS will be required for ‘stringing’ Ns:
(24) Stringing noun : verb rule
N      :   V
aFi        aFi
           −trns
           +strg
           +hbtl
[      :   [ile
strg = ‘string’; hbtl = habitual
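The stringing rule can be read as a relation between whole words rather than as an instruction to glue two pieces together. The sketch below is one hedged way to model that reading. The names `Word` and `PrefixWFS` and the feature label "fish" are hypothetical and introduced only for illustration, the feature bundle written aFi is stood in for by a plain set, and the pair ek ‘fish’ : ileek is the one cited in example (12).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str            # the whole word, stored with no internal boundary
    category: str        # "N" or "V"
    features: frozenset  # stands in for the shared bundle written aFi


@dataclass(frozen=True)
class PrefixWFS:
    """Hypothetical noun : verb strategy whose constant is word-initial."""
    constant: str
    verb_features: frozenset

    def verb_for(self, noun: Word) -> Word:
        # Supports adding a new member to the set; the result is an
        # unsegmented string, not a structure with a noun inside it.
        if noun.category != "N":
            raise ValueError("the left-hand member must be a noun")
        return Word(self.constant + noun.form, "V",
                    noun.features | self.verb_features)

    def relates(self, noun: Word, verb: Word) -> bool:
        # The analogical statement proper: do these two listed words
        # stand in the relation described by the strategy?
        return noun.category == "N" and verb == self.verb_for(noun)


# Rule (24): constant [ile.., added features -trns, +strg, +hbtl.
STRINGING = PrefixWFS("ile", frozenset({"-trns", "+strg", "+hbtl"}))

ek = Word("ek", "N", frozenset({"fish"}))
ileek = STRINGING.verb_for(ek)
```

Here `STRINGING.relates(ek, ileek)` holds, and `ileek.form` is the plain string `ileek`. Nothing in the verb's entry records where the constant ends and the variable begins, which is the point of the seamless analysis.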
(25) (Pagotto 1987: 681, 60a, SS’s reanalysis)
rej        ileek
3pl-PROG   string-fish
‘They are stringing fish.’; cf. ilele ‘string’, ek ‘fish’.
Here the features [N,aFi] characterize some set of nouns, and [V, aFi, −trns, +strg, +hbtl] characterize the syntactic and semantic
features common to a set of corresponding verbs. The nouns differ phonologically from the verbs in lacking an initial substring [ile.. which all the verbs have.
The semantic effect of such incorporation rules has been nicely summarized by Harrison (1976) and Lee (1975):
The final distinction between the transitive verb with its object and the derived intransitive verb with its included object must be drawn in terms of meaning. An exact characterization of the distinction remains a problem to be solved. The only tentative distinction that has been made at this moment is that the included object restricts the range of potential reference of a verb. (Lee 1975: 277)
An incorporated object construction is still an intransitive verb; it is merely one that names a more specific kind of activity (joai jahr ‘knife sharpening’ rather than simply joaijoai ‘sharpening’ for example). It is as if the addition of a noun refines the meaning of the verb in question, limiting its application to the set of objects named by the noun. (Harrison 1976: 162)
I concur with this statement, except that I would of course not use the phrase ‘addition of a noun’. Except for the minimal semantic information given in the WFS, the semantic properties are not completely predictable, and depend on the situation in which the word was coined.
5. Morphology of incorporation
(a) Relative ordering of ‘incorporated noun’ and other verbal elements
The seamless analysis is consistent with the characteristic properties of Micronesian NI verbs. For example, it accounts for relative ordering of ‘incorporated nouns’ and directional or aspectual sequences.
In most of the languages exemplified in (28), it is possible to suffix verbs with a variety of morphemes indicating aspect or direction. In these languages, when a transitive verb is suffixed, the verbal affix is attached to the verb. However, in an incorporated object construction, the verbal suffix appears after the noun: .... (Pagotto 1987: 411–12)
Seamlessly speaking, the ‘incorporated noun’ is an undifferentiated part of the verb, and the verb form as a whole is the input to aspectual and directional WFSs. The following illustrative examples are taken from Pagotto (1987: 412). The word-internal spaces and morpheme boundaries have been removed from the words, and hyphens have been added to the glosses for (26) and (28):
(26) Mokilese (Harrison 1976: 163)
ngoah   kooaringla
1s      grind-coconut-CMPL
‘I finished coconut-grinding.’
(27) Pohnpeian (Rehg 1981: 214)
i    kengwiniher
1s   eat-medicine-CMPL
‘I have medicine-taken.’
(28) Kosraean (Lee 1975: 271)
el   twetwemitmitlac
3s   sharpen-knife-CMPL
‘He has knife-sharpened.’
In each case, the completion substring appears at the end of the verb, oblivious to the fact that the preceding substring is etymologically related to a free noun.
(b) ‘Combining forms’
The NI phenomena in Pohnpeian add further support to the seamless analysis. Here are some representative examples.
(29) Pohnpeian (Rehg 1981: 209–10)
(a) I pahn ihkose likou ehu. ‘I will pleat a dress.’
(b) I pahn ihkos. ‘I will pleat.’
(c) I pahn ihkoslikou. ‘I will dress-pleat.’
Rehg distinguishes free verb forms from ‘combining forms’, the shapes that verbs assume in incorporating constructions. The shape may be identical to the shape of the free intransitive verb, e.g., (Rehg 1981: 210)
(30) Transitive   Intransitive   Combining Form16   English
     leke         lek            lek-               ‘to slash or castrate’
     ngkoale      ngkoal         ngkoal-            ‘to make sennit’
     peieki       peiek          peiek-             ‘to slide’
     pereki       perek          perek-             ‘to unroll’

but in some cases it is different from both the transitive and intransitive free forms (Rehg 1981: 210).

(31) Transitive   Intransitive   Combining Form   English
     daper        dapadap        dap-             ‘to catch’
     par          pereper        per-             ‘to cut’
     rese         rasaras        ras-             ‘to saw’
Rehg does not explain why the verbs change their forms when they incorporate object nouns. From the seamless point of view, of course, there is no change to be accounted for, since the ‘combining forms’ are not verbs at all, but constants in a WFS. Their connection with free verbs is etymological rather than synchronic. Historically there were verbs which served as models for these constants, and then evolved separately into modern transitive and intransitive verbs via suffixation, reduplication, etc., while the constants, not being themselves verbs or even words, were not subject to these processes.

The situation for Mokilese is similar to that in Pohnpeian: the ‘verbs’ that occur with ‘incorporated nouns’ are phonologically distinct from free verbs (Harrison 1976: 296–97):

11.7.3.5 This process, as described in 6.5.6, creates an intransitive verb from a compound of the intransitive form of a bi-transitive verb (often a special unreduplicated ‘combining form’17 in the case of reduplicated intransitives) and a noun. These incorporated object constructions are analogous to English two-word verbs like ‘baby-sit’. For example:

(32) dok mwumw      ‘to spear-fish’
(33) oakoarok jeri  ‘to baby-sit’
(34) poad suhkoa    ‘to plant trees’
(35) rik sakai      ‘to gather stones’
In a seamless analysis, these forms would be written without the spaces, since they have no internal structure. The monolemmatic status of such forms is especially clear in two additional examples (Harrison 1976: 297):

On occasion, the elements of an incorporated object construction are bound so tightly as to become one word phonologically:

(36) johnoai ‘to make a fire’ (from jaun ‘to feed a fire’ + oai ‘fire’)
(37) jileimw ‘to watch the house’ (from jiloa ‘to watch’ + umw ‘house’)

From the seamless perspective, all these Mokilese NI constructions are single words phonologically. What is special about the latter two examples is that historical phonological changes have made the etymological connections with free verbs more obscure than usual.

Adoption of the seamless analysis would make the description of Mokilese verb morphology more compact and less arbitrary. For example, Harrison describes a class of ‘unreduplicated intransitive verbs’ with special distributional properties (Harrison 1976: 156–58):

A number of bi-transitive verbs have both a reduplicated and a nonreduplicated intransitive form, the former being a reduplicated form of the latter.... In verbs of this type, the reduplicated form of the intransitive is the most widely used. It is typically (but not always) used actively (with an agent as subject) to name the activity in question (see 9.1). In such cases, the unreduplicated form of the intransitive cannot be used. For example:

(38) (Mokilese; Harrison 1976: 157, 85a)
     Ngoah joaijoai
     ‘I’m sharpening.’

(39) (Mokilese; Harrison 1976: 157, 85b)
     *Ngoah joai.

‘The most common use of the unreduplicated intransitive is in INCORPORATED OBJECT CONSTRUCTIONS (see Section 6.5.6) like:

(40) (Mokilese; Harrison 1976: 157, 86)
     ngoah joai jahr.
     ‘I am knife-sharpening.’
‘It is also often used as a stative verb describing the result of the action named by the corresponding active verb.

(41) (Mokilese; Harrison 1976: 158, 87a)
     ngoah pwalpwal.
     ‘I am chopping.’

(42) (Mokilese; Harrison 1976: 157, 87b)
     ngoah pwal suhkoa.
     ‘I am tree chopping.’

(43) (Mokilese; Harrison 1976: 157, 87c)
     suhkoahu pwalpijoang. {7}
     ‘The tree has been split apart.’

(44) (Mokilese; Harrison 1976: 157, 87d)
     ngoah pwal aio.
     ‘I was operated on yesterday.’

Examples 87a and 87b are both active, in which the subject is the person performing the action, while 87c and 87d are stative, where the subject is the person (object) affected by the action. Only the reduplicated intransitive can be used in sentences, like 87a, which name an activity without reference to the object towards which the activity is directed. The simple (unreduplicated) intransitives can only appear in sentences in which the object is expressed (either in incorporated object constructions, or in stative sentences where the affected object is the subject). (Harrison 1976: 157)

The seamless analysis simplifies and motivates this list of otherwise arbitrary stipulations. The reduplicated intransitives are intransitive activity verbs, and the unreduplicated intransitive in (44) is a stative intransitive verb. Other ‘unreduplicated intransitives’ are not verbs at all, but word-internal derivation-marking strings of phonemes.

In addition to bi-transitive verbs lacking a simple (unreduplicated) intransitive form, there are a number of verbs that have only a simple intransitive form, but one that cannot be used in simple active event-naming sentences like 85 and 87 above. Thus, the verb kohkoat ‘to grind something’ has an intransitive form koi that
can be used only in incorporated object constructions or stative sentences. For example: (Harrison 1976: 159)

(45) (Mokilese; Harrison 1976: 159, 90a)
     ngoah kohkoa oaringkai.
     ‘I’m grinding these coconuts.’

(46) (Mokilese; Harrison 1976: 159, 90b)
     ngoah ko oaring.
     ‘I’m grinding coconut.’

(47) (Mokilese; Harrison 1976: 159, 90c)
     oaringkai kohla.
     ‘These coconuts have been ground up.’

(48) (Mokilese; Harrison 1976: 159, 90d)
     *ngoah ko.
     ‘I’m grinding.’

Seamless explanation: there is no intransitive activity verb ‘to grind’; it is a lexical gap. The form [ko.. is not a verb but the constant in a WFS.

Like some other Micronesian languages, Mokilese has suppletive forms for transitive and intransitive ‘eat’ (Harrison 1976: 156). ‘Note that the verb mwinge “to eat” is an active intransitive used to name an activity. The verb kang cannot so be used.’

(49) (Mokilese; Harrison 1976: 156, 83a)
     Ngoah pirin mwinge.
     ‘I’m going to eat.’

(50) (Mokilese; Harrison 1976: 156, 83b)
     Ngoah pirin kang.
     ‘I’m going to eat it.’

Example (83b) can only be used when the speaker is going to eat a specific thing. However, the verb mwinge cannot be used in incorporated object constructions, while kang can:

(51) (Mokilese; Harrison 1976: 156, 84a)
     *Ngoah pirin mwinge rais.

(52) (Mokilese; Harrison 1976: 156, 84b)
     Ngoah pirin kang rais.
     ‘I’m going to eat rice.’
‘For this reason, I conclude that kang is both transitive and intransitive, though restricted in its use as an intransitive form’ (Harrison 1976: 156).

From the seamless point of view, there is only one transitive verb kang, which has the same distribution as other Mokilese transitive verbs. However, there is also a derivational constant [kang.. which appears as a constant in an ‘incorporation’ WFS, as well as another derivational constant [keng.. and an intransitive verb mwenge ‘to eat’. This seamless analysis has interesting implications for the reconstruction of Micronesian verbal morphology: the ‘verbs’ in NI constructions derive historically from verbs which existed at a time before the suppletive intransitives were innovated. However, that question lies beyond the scope of this paper.

A very similar pattern shows up in Pohnpeian. Again, there are suppletive forms for intransitive ‘eat’ (mwenge) and transitive ‘eat’ (kang), and again, the ‘combining form’ for ‘eat’ (keng-) is more similar to the transitive verb (Rehg 1981: 195, 202, 211).

(53) Transitive Form   Combining Form   English
     kang              keng-            ‘to eat’

(54) Noun   English      With a Verb   English
     wini   ‘medicine’   kengwini      ‘to medicine-eat’
Not surprisingly, the form of Pohnpeian incorporated nouns may also be different from that of related free forms (Rehg 1981: 212).

(55) Pohnpeian
     i    pahn   pereki      lohso.
     1s   will   unroll-TR   mat-that
     ‘I will unroll that mat.’

(56) Pohnpeian
     i    pahn   pereki      lohs.
     1s   will   unroll-TR   mat
     ‘I will unroll mats.’

(57) Pohnpeian
     i    pahn   pereklos.
     1s   will   unroll-mat
     ‘I will unroll mats.’
The difference in the length of the vowel of lohs in the phrases pereki lohs and pereklos is a consequence of the fact that a noun following a combining form of a verb is usually tightly bound to that verb in a manner analogous to compounding. Thus, the monosyllabic noun vowel lengthening rule that produces a form like lohs does not apply here, since los is part of the verb and is not standing alone as an independent word. This is one reason why combining forms of verbs are commonly written together with the nouns that follow them. (Rehg 1981: 212)

These forms thus cannot be compounds because they are not composed of two free forms. Rehg’s explanation of the difference in shape between free nouns and nouns in compounds is circular: we can infer that ‘nouns’ like los are ‘tightly bound’ because they are shorter than their free counterparts, and they have to be shorter because they are ‘tightly bound’. In a seamless analysis, by contrast, there is nothing to explain: the Monosyllabic Noun Vowel Lengthening Rule (Rehg 1981: 117–19) didn’t apply to ..los] because ..los] was and is not a noun.18

It is not true, however, that the lengthening of the vowels of monosyllabic nouns never takes place in constructions of this type. In forms like the following, lengthening does occur. (Rehg 1981: 117)

(58) Noun   English    Combining Form   English
     uhk    ‘net’      dowuhk           ‘to net-weave’
     uht    ‘banana’   sapuht           ‘to banana-harvest’

With the construct suffix, the vowels in these nouns are short, as demonstrated below.

(59) Noun   English    Construct Form   English
     uhk    ‘net’      ukin             ‘net of’
     uht    ‘banana’   utun             ‘banana of’
The fact that these vowels remain long with combining forms of verbs is somewhat unexpected. This may be due to the difference in the degree to which the native speaker thinks of these two-word verbs as single units. (Rehg 1981: 213)

Again, Rehg’s ‘explanation’ for these facts is circular. The speakers don’t shorten these vowels in (58) because they are not thought of so much as single units, and we know that the speakers don’t think of these forms so much as single units because they don’t shorten the vowels. Assuming that the examples in (58) really are NI and have not been simply misanalyzed, a better explanation could be that the exceptional forms were historically long to begin with, before the point of time at which they were reanalyzed as derivation markers in NI constructions. The shortened vowel in the construct form would then be the result of analogical shortening under the pressure of free–short alternations in numerous other long free noun–construct noun pairs.
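The constant-plus-variable view of NI verbs used in this section can be illustrated with a small sketch. It is an illustration only, not the formalism of the seamless model: the class name `WFS`, its two methods, and the reduction of a strategy to "fixed initial string plus variable remainder" are assumptions made here. The forms themselves are the Pohnpeian ones cited above from Rehg (1981).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WFS:
    """Toy Word Formation Strategy: a constant initial string plus a variable.

    Hypothetical simplification for illustration only; the constant is a
    plain string with no syntactic category of its own.
    """
    constant: str

    def form(self, variable: str) -> str:
        # A derived word is constant + variable, with no internal boundary.
        return self.constant + variable

    def parse(self, word: str) -> Optional[str]:
        # Return the variable string if the word fits the strategy, else None.
        if word.startswith(self.constant) and len(word) > len(self.constant):
            return word[len(self.constant):]
        return None


# Free forms cited in the text (Rehg 1981).
FREE_VERBS = {"kang", "mwenge", "pereki", "perek"}
FREE_NOUNS = {"wini", "lohs", "likou"}

eat = WFS("keng")      # constant found in kengwini
unroll = WFS("perek")  # constant found in pereklos

for strategy, word in [(eat, "kengwini"), (unroll, "pereklos")]:
    variable = strategy.parse(word)
    print(word, "=", strategy.constant, "+", variable,
          "| constant is a free verb:", strategy.constant in FREE_VERBS,
          "| variable is a free noun:", variable in FREE_NOUNS)
```

In this toy lexicon the constant keng matches no free verb and the variable los matches no free noun, which is the sense in which neither string needs to be a word.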
Advantages of a Seamless Analysis

In summary, then, what are the advantages of a seamless analysis of ‘noun incorporation’?

1. Generativity. The seamless analysis accounts for the properties of NI patterns formally and explicitly by means of conceptually simple Word Formation Strategies.

2. Expressive power. There is no need for powerful (and never explicitly stated) transformational rules to create NI structures in a syntactic formation analysis, or for empty-headed NPs in a lexical analysis, or for ad hoc ‘principles’19 to rationalize unexpected behaviour. And of course, as befits a seamless analysis, no rules create or refer to unnecessary word-internal constituent boundaries or non-phonological hierarchical structure.

3. Syntactic and anaphoric irregularities. ‘Incorporated nouns’ are non-referential, can’t have their own dependents,20 and can’t be ‘moved’.21 Reason: they aren’t nouns; they are word-internal strings which have no syntactic category, and thus are subject to no syntactic or anaphoric requirements.

4. Non-identity of free and bound shapes. ‘Incorporated nouns’ and ‘incorporating verbs’ may differ from corresponding free nouns and verbs because they are not nouns and verbs but only recurring strings of sounds contained in word forms.

5. Internal inflection. One problem that has been lurking in the shadows of compound analysis since the beginning is the question of internal inflection: if compounds are composed of free words, why do the components have to be uninflected words? The problem is especially acute for languages in which all nouns and verbs are inflected. In these languages too, paradoxically, the forms in compounds are uninflected. This presents a dilemma: if all free forms are inflected, yet the forms inside of compounds are not inflected, how can we maintain the definition of compounds as words composed of free forms? A common evasion is to state that in such languages, we must talk about compounds as composed of bound stems rather than of free forms (Chao 1968: 145–46; Sapir 1911). ‘Instead of requiring the constituents of a compound to be free words, we shall take compounds in a wide sense so as to include those in which the parts are bound morphemes other than affixes’ (Chao 1968: 359). At that point, the strict definition of compound no longer applies, and the distinction between compounds and derived words becomes fuzzy and subjective. So what about the seamless analysis then? The answer is quite straightforward: ‘compounds’ are not composed of free forms at all, since words have no internal structure. ‘Incorporated nouns’ and ‘incorporating verbs’ fail to take noun and verb inflections for the same reason: they are not nouns or verbs. A compound, like any other morphologically derived word, can be parsed as a constant string and a variable string. Neither of these strings is a lexical unit, and since neither belongs to a syntactic class, neither takes the inflections required of members of that class.

6. Non-compositionality. Everyone knows that compounding, including ‘noun incorporation’, is semantically non-compositional; the meaning of the whole word is not predictable from the meaning of the component words.22 In a compounding analysis, this must simply be stipulated and has never been explained in linguistic terms. The seamless analysis, though, accounts for this property directly: there are no ‘component words’ and so ‘non-compositionality’ is meaningless. Instead, as Starosta et al. showed for Chinese ‘compounds’ (Starosta et al. 1998: 361–62), semantic properties of individual patterns turn out to be remarkably consistent when the patterns are viewed as derivation rather than compounding.

7. Hierarchy of incorporation. The argument that normally gets ‘incorporated’ in NI is not the ‘direct object’ but the Patient (PAT) in the lexicase sense. It has to be the Patient rather than the ‘direct object’ which gets incorporated because: (1) ergative languages don’t have direct objects but may incorporate Patients, and (2) Patient incorporation also includes intransitive subject incorporation in languages such as Thai.23 In a seamless lexicase analysis, one WFS fits all. The process chooses PAT because it is the central case relation, the one that occurs obligatorily with every verb and is conceptually most closely associated with it. As such, it is also normally the one closest in linear sequence to the verb, and thus more likely to get reinterpreted as part of the verb. In Comrie’s view (Comrie 1978: 337), the hierarchy of incorporation is P>S>A. In a lexicase analysis, both ‘P’ and ‘S’ are PAT, so the hierarchy reduces to PAT>AGT.

8. Evolution. It is sometimes recognized in the literature that components of compounds can evolve into derivational affixes:

Less often, a relatively independent form is reduced to affixal status. Compound-members are occasionally reduced, by sound change, to suffixes; thus, the suffix -ly (manly) is a weakened form of like, and the suffix -dom (kingdom) of the word doom. This happens especially when the independent word goes out of use, as in the case of -hood (childhood), which is a relic of an Old English word [ha:d] ‘person, rank’ (Bloomfield 1933: 414–15).

From the seamless point of view, the reason that the elements of compounds can evolve like derivational affixes is that they are (the seamless equivalent of) derivational affixes to start with. They are thus subject to a different set of phonological changes than those that affect the corresponding free nouns. The loss of an etymologically related free form, such as the hood reflected in childhood, just makes the mistaken compounding analysis more untenable.
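The reduction of Comrie’s hierarchy described in point 7 can be made concrete with a short sketch. The mapping of P, S and A onto lexicase case relations follows the text; the function name `collapse_hierarchy` and the list representation of a ranking are assumptions introduced here for illustration.

```python
# Comrie's labels mapped to lexicase case relations, as stated in the text:
# both P and S are Patients (PAT); A is the Agent (AGT).
LEXICASE_RELATION = {"P": "PAT", "S": "PAT", "A": "AGT"}


def collapse_hierarchy(ranking):
    """Relabel a ranking and merge adjacent positions that become identical."""
    collapsed = []
    for label in ranking:
        relation = LEXICASE_RELATION[label]
        if not collapsed or collapsed[-1] != relation:
            collapsed.append(relation)
    return collapsed


comrie = ["P", "S", "A"]  # P > S > A (Comrie 1978: 337)
print(" > ".join(collapse_hierarchy(comrie)))  # prints "PAT > AGT"
```

Once P and S receive the same label, the three-way ranking has only two distinct positions left, which is all the reduction amounts to.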
Disadvantages of the Seamless Analysis?

Capturing the Similarity between Word Structure and Phrase Structure

If theories and analyses are evaluated in accordance with the generalizations they achieve, then doesn’t a purely seamless analysis lose out in failing to account for parallels between word structures and phrase structures? ‘... for their part, non-syntactic approaches often do not address broad similarities between the structure of heads and that of phrases, as our discussion of clitics and synthetic compounds suggests’ (Hendrick 1995: 341).

In fact, the exact opposite is true. First of all, there are crucial differences between phrases and words; in particular, words in phrases allow a certain range of dependents, which may in turn have dependents of their own. That is, syntactic structures are potentially recursive. Word-internal ‘words’ in ‘word structures’ by contrast do not allow the same range of dependents as free words do, and normally do not allow any dependents at all. In Micronesian NI analyses, this fact has been occasionally recognized but not accounted for. The closest thing I have found to a serious explicit analysis of the phenomenon is a single diagram in Shelly Harrison’s reference grammar of Mokilese which somehow escaped the keen eyes of the Theory Police:

(60) Mokilese (Harrison 1976: 163)
     [S [NP ngoah] [Pred [VP [V ko] [N oaring] [Suffix -la]]]]
     106b Ngoah ko oaringla
     ‘I finished coconut grinding.’

The Pred and VP nodes date this as a probable reflection of Chomsky’s Aspects model (Chomsky 1965), while the attachment of verbal affixes directly under the VP indicates the influence of the misuse of the VP node in much work on Oceanic and especially Polynesian linguistics (Jensen 1977: 192–93, cf. Pawley 1961: 43,24 Pawley 1970: 317–18 and Rehg 1981: 255). The diagram is syntactically inadequate, first of all because Suffix is not a word. It does not have a syntactic distribution, and so does not have a word class; thus it should not have a word node in a syntactic analysis. Second, in the light of dependency-based analyses, including Chomsky’s X-bar theory, every X should have an X-bar projection.25 However, the N in (60) has no projection. This stratagem is needed to prevent the N from having
its own dependents, but it has no theoretical justification. It is simply an ad hoc hack to evade empirical falsification.

The second problem with the claimed parallels between phrase and word structure is history. We know that Hendrick’s ‘broad similarities between the structure of heads and that of phrases’ are historical accidents that cannot and should not be accounted for in a synchronic grammar. We know this because when syntactic typology changes, word structure patterns that reflect the older syntactic patterns may remain behind. Thus Sora, a Munda language spoken in eastern India, has the SXV clause structure expected in the South Asian language area (Masica 1991: 332–33; in this passage, Masica describes the word order in terms of NPs and Vb rather than in terms of ‘subjects’), while Sora NI structures (Biligiri 1965; Starosta 1967, 1989–90: 87) reflect the older ‘VX’ syntactic pattern which is typical of related Mon-Khmer languages in Southeast Asia (Donegan and Stampe 1983). Compare (61) and (62).

(61) anin nsel n ad a le
     he/she woman her-body got
     AGT PAT +trns
     actr
     ‘He acquired the woman.’ (‘SOV’)

(62) anin a bo le
     he/she got-woman-PAST
     PAT –trns
     actr
     ‘He got married.’ (‘SVO’; [..bo ..] is the ‘combining form’ of nsel n ‘woman’.)

Apparent word-internal structure is then independent of syntactic structure, and only appears to be subject to the same rules if it hasn’t changed since the time the word-formation patterns crystallized. The seamless analysis is consistent with this fact, since it analyses ‘word structure’ and phrasal structure as independent of each other, and so predicts that the morphological shape will not change to follow changes in syntax. For a syntactic analysis of NI formation, by contrast, this is a serious embarrassment, and would require ad hoc rationalizations in place of the explanation provided by the seamless approach.
LOC with Internalized PAT in its Scope

One of the generalizations that can be captured neatly in a lexicase grammar is that the semantic scope of a complement case relation is the Patient of the verb, and that this applies in both transitive and intransitive sentences. This is illustrated in the following Marshallese sentences:

(63a) (Pagotto 1987: 402, 22b)
      Aisen   ear       etal   ¤an1   Awai
      Aisen   3s-CMPL   go     to     Hawai’i
      PAT                             LOC
      ‘Aisen went to Hawai’i.’

(63b) (Pagotto 1987: 482, 82a)
      Kw•ön     ane        k     etan      jän    bok e
      2s-PTTV   kill[TR]   DIR   name-3s   from   book that
      (AGT)                      PAT              LOC
      ‘Cross his name out of that book.’
Thus in both (63a) (intransitive clause) and (63b) (transitive clause), the locative complement specifically describes the location of the Patient, not of the agent or of the action as a whole. In (63a), the PAT Aisen is moving to the LOC Awai ‘Hawai’i’, while in (63b), the PAT etan ‘his name’ moves away from the LOC, bok ƒe ‘that book’. However, this generalization at first appears to hold in the case of ‘incorporated objects’ only if the LOC can refer to a PAT contained inside the verb. This is illustrated in examples (65–67).26

(65) Pagotto 1987: 416, 35a; SS’s analysis27
     Rej       kö anlodidea¤   jän1   kimjän    ni
     3pl-PROG  make-pinwheel   from   leaf-of   coconut
     m[PAT]    –trns           LOC
     ‘They are making pinwheels from coconut leaves’; cf. kö an ‘make’, lodidea¤ ‘coconut leaves’.

(66) Pagotto 1987: 417, 35b; SS’s analysis
     kwöj      aljek weiuk   ¤an1   ia ?
     2s-PROG   carry-goods   to     where
     m[PAT]    –trns         LOC
     ‘Where are you taking those goods?’; cf. aljek ‘carry’, weiuk ‘goods’.
(67) Pagotto 1987: 418, (36); SS’s analysis
     Rej       kattul karmej          i    ar
     3pl-PROG  CAUS-submerge-person   in   lagoon
     m[PAT]    –trns                  LOC
     ‘They are dunking people in the lagoon.’; cf. kattul k ‘submerge (tr.)’, armej ‘people’.

In each of these examples, the PAT is the contextually implied performer of the action, ‘they’, ‘you’, and ‘they’ respectively. However, in each example the LOC constituent seems to locate the notional ‘object’ rather than the grammatical PAT. Thus in (65), the pinwheels emerge from coconut leaves, in (66) the goods move to ‘where?’, and in (67) the people end up in the lagoon. This seems to contradict the lexicase prediction that the LOC refers to the location (or source or destination) of the grammatical PAT, and that an incorporated ‘object’ should be grammatically and anaphorically inaccessible.

The Marshallese situation parallels in one important respect a set of examples in English, illustrated by (68) and (69).

(68a) (cf. 24a, Starosta 1988: 133)
      John   spat1   his bubble gum   into the fountain.
      AGT    +trns   PAT              LOC
             +drcn
             +motn

(68b) (cf. 24, Starosta 1988: 132)
      John   spat2   into the fountain.
      PAT    –trns   LOC
             +drcn
             –motn
             +xpls
      +xpls = expulsion

(69a) (cf. 25a, Starosta 1988: 133)
      The troops   fired1   rubber bullets   at the demonstrators.
      AGT          +trns    PAT              LOC
                   +drcn
                   +motn
(69b) (cf. 25, Starosta 1988: 133)
      The troops   fired2   on the demonstrators.
      PAT          –trns    LOC
                   +drcn
                   –motn
                   +xpls

(69c) *The troops fired2 rubber bullets on the demonstrators.

In a lexicase grammar, forms with distinct grammatical distributions are different lexical entries, so spat1, spat2, fired1 and fired2 are four distinct words. As is illustrated in (69c), the intransitive verbs cannot in general be analyzed as transitive verbs with optional objects. In (68a) and (69a), as expected, the PAT is interpreted as moving to the LOC. In (68b) and (69b), however, the situational object is not grammatically encoded at all and the verbs instead encode an intransitive action oriented toward a LOC. In this sense, spit into and fire on are grammatically identical to peered into in (70).

(70)  The witch   peered   into the microwave.
      PAT         –trns    LOC
                  +drcn
                  –motn
In Starosta 1988, it is argued that the most general analysis of these data would be to treat (68b), (69b) and (70) as grammatically identical, so that the object expelled in (68b) and (69b) is grammatically irrelevant; it is of course present in the situation and must be expressed in the lexically related (68a) and (69a), but it is not part of the perception encoded in (68b) and (69b). In this paper I extend this same analysis to the Marshallese constructions illustrated in (65–67). Thus corresponding to (66), for example, there is a grammatically identical clause (71) which does not involve NI:

(71) (Pagotto 1987: 120, 75d)
     ear       jab   e tok           ¤an1   ¤a
     3s-CMPL   not   pay-attention   to     1s
     m[PAT]          –trns           LOC
                     +drcn
                     –motn
     ‘He didn’t pay attention to me.’
Again, as with the English verbs spat2, fired2 and peered, e tok ‘pay attention’ (71) has no ‘incorporated object’ morphologically, and there is nothing which is conceptually encoded as moving to the LOC. All are intransitive locative verbs encoding an action oriented toward a location. I propose to extend this same analysis to the Marshallese verbs kö anlodidea¤ ‘make-pinwheel’ (65), aljek weiuk ‘carry-goods’ (66), and kattul karmej ‘dunk-people’ (67). The evidence for this analysis is the language-internal and cross-linguistic generalizations thereby captured and the general theoretical constraints thereby made possible. In each of these cases, the intransitive locational verbs are lexically and etymologically related to their transitive counterparts. This happens to be marked by the ‘incorporated object’ substrings in some Marshallese examples, including (65–67), but not in (68b) or (69b). Perceptually, grammatically and anaphorically the ‘objects’ are not there, and the ‘incorporated objects’ are just fossils that help to mark the incorporating forms as grammatically and semantically different.
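The claim that spat1, spat2, fired1 and fired2 are distinct lexical entries, and that the intransitive members pattern with peered, can be sketched as a small data structure. The feature values are those given in (68)–(70); the class `LexicalEntry`, the boolean encoding of the features, the choice of which features count toward distribution, and the function `loc_reading` are all assumptions made for illustration and are not lexicase notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str
    trns: bool          # +trns / -trns
    drcn: bool          # +drcn (read here, by assumption, as taking a directional LOC)
    motn: bool          # +motn (read here, by assumption, as the PAT being encoded as moving)
    xpls: bool = False  # +xpls = expulsion

    def distribution(self):
        # Features assumed here to determine grammatical distribution.
        return (self.trns, self.drcn, self.motn)


def loc_reading(entry: LexicalEntry) -> str:
    """Gloss how the LOC complement is interpreted for a given entry."""
    if not entry.drcn:
        return "no directional LOC complement"
    if entry.motn:
        return "PAT moves to LOC"
    return "action is oriented toward LOC"


LEXICON = [
    LexicalEntry("spat1", trns=True, drcn=True, motn=True),
    LexicalEntry("spat2", trns=False, drcn=True, motn=False, xpls=True),
    LexicalEntry("fired1", trns=True, drcn=True, motn=True),
    LexicalEntry("fired2", trns=False, drcn=True, motn=False, xpls=True),
    LexicalEntry("peered", trns=False, drcn=True, motn=False),
]

for entry in LEXICON:
    print(entry.form, "->", loc_reading(entry))
```

On this encoding spat2, fired2 and peered share one distribution, and none of them encodes anything as moving to the LOC, which is the property the analysis extends to the Marshallese NI verbs.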
Stranding

One potential problem for a seamless analysis is the phenomenon which has been referred to as ‘stranding’ (Rosen 1989: 296). At first glance, ‘stranding’ appears to support a syntactic analysis of NI verbs, since it appears to be a case in which the head N of a noun phrase has been incorporated into the verb while leaving its modifiers stranded outside, and one might think that there is no way in which such a process could take place in the lexicon. The stranding phenomenon is also found in Micronesian languages, in particular in Yapese, where it appears that a head noun can be incorporated while leaving a relative clause behind (Jensen 1977: 220).

‘Relative clauses may be used with the object in an incorporated object construction, but they must follow the verb phrase as a whole, and thus must follow the subject number marker. Thus one may say:

(72) Yapese
     Ku qu ra thuum’ qachif gow ni ba maenigil.
     ‘They have been cutting coconut toddy which is good.’

but one cannot say:

(73) Yapese
     *Ku qu ra thuum’ qachif ni ba maenigil gow.

(Jensen 1977: 220)
By the seamless analysis, thuum’qachifgow in (72) is a single word, and does not contain either a word thuum’ ‘cut’ or a word qachif ‘coconut toddy’. If so, how can the sentence contain a relative clause ni ba maenigil modifying a noun qachif which does not exist? Remember that we are working in a constrained theory with no movement rules and no empty categories, so an ‘empty head’ analysis like Rosen’s (Rosen 1989: 296) is not available.

I think the answer to this conundrum lies in the nature of relative clauses. In a large number of Austronesian languages, a ‘relative clause’ is an NP in apposition to a regent noun, and as such can appear alone as an argument of a verb. Thus, for example in Bahasa Indonesia, a sequence N yang S ‘the N that Ss’ is formally:

[Diagrams: a phrase-structure tree and the corresponding stemma in standard lexicase notation for N yang S, with yang (N) glossed ‘one which’ and S headed by a V marked prdc]

‘the N which is the thing (yang) which Ss.’

Applying the same analysis to Jensen’s example, the stemma would be:

[Stemma for Ku qu ra thuum’qachifgow ni bamaenigil; glosses in the source: ‘have’, ‘they’, ‘toddy-cut’, ‘that which is’, ‘good’; node labels in the source include Adv, V, N, plrl and –trns]

‘They two have been toddy-cutting something which is good.’
So, no movement and no empty categories. Rosen claims to have proven that a null head analysis is necessary for such constructions:

To sum up, this paper has shown ... (iii) that stranding facts follow from independent principles of syntax – the existence of null arguments (pro-drop) and of null NP heads.... Further, I have shown that only a lexical account of NI can tie together the stranding facts within NI and the independently existing null-head modifiers as originating from the same grammatical source. (Rosen 1989: 316)

The seamless analysis by contrast accounts for the same facts without the necessity of a powerful empty-head analysis. Of course it does not show that the NI and non-NI constructions share a common source, except in some rather tenuous etymological sense, because synchronically they don’t.

If the form of the NP is truly independent of the incorporated noun, then null-head modifiers (often called stranded modifiers) should be possible whether a noun is incorporated or not. The analysis assumes that null objects are independent of NI, and thus null-head modifiers should occur independent of NI. The crucial evidence for such independence requires a sentence with a simple transitive verb, and a direct object NP with a null head. Mithun 1984 presents evidence for null heads of NPs in Mohawk. (Rosen 1989: 298–99)

The seamless analysis tells us that the modifiers are themselves NPs, which is why they can appear alone without other regent nouns. Mithun’s examples are not evidence for a null-head analysis unless no less powerful alternative analysis is available. Here, as in Mohawk and in other such cases, one is.

Finally, there is one mysterious construction in Kosraean which appears to be a case of some kind of stranding.

‘One main function of the classifiers is to modify a head noun in a noun phrase.... Besides this main function in a noun phrase, the classifiers have another important function, to indicate benefactors, in a verb phrase....In the first sentence lal appears before the determiner ah, but in the second sentence, it appears after ah.

Nga molelah rais lal Sohn ah. ‘I have bought John’s rice.’
Nga molelah rais ah lal Sohn. ‘I have bought the rice for John.’

As the position of the classifier lal in the two sentences above is different, so is the meaning of the two sentences. In the first sentence, the classifier lal is a part of the object noun phrase. By means of the classifier, the possessor of the rice is expressed. But lal of the second sentence above is not a part of the object noun phrase. Lal indicates a new possessor. That is, it means that I have bought the rice for him. (Lee 1975: 261–62)

The classifiers indicating benefactors in a verb phrase can be used with derived intransitive verbs only when they are used with their included objects. ... Without them, the classifiers cannot be used.... Although the classifiers indicating benefactors in a verb phrase do not belong to the object noun phrase, they must agree with the head nouns of the object noun phrase or with included objects [my italicizing]. (Lee 1974: 263)

This phenomenon differs from the standard examples of stranding in that the benefactive classifier phrases are not clearly part of the object noun phrases they agree with. Thus when the ‘object’ gets ‘incorporated’, they remain in the same syntactic position, and so have not been ‘stranded’. However, it resembles stranding in that the benefactive phrase continues to agree with the incorporated object after incorporation, and only appears with verbs that are either transitive or NI. An analysis like the one proposed for Yapese does not appear to be available here since the benefactive phrases do not appear to be syntactically free NPs. I leave this problem open pending a serious theoretically founded analysis of Kosraean benefactive constructions.
Conclusion
In conclusion, I believe I have shown that a seamless analysis of Micronesian ‘noun incorporation’ accounts for the generalizations and apparent exceptions found in the data within a rigorous and constrained grammatical theory, and thus must be considered preferable to analyses of the same phenomena which are more powerful and/or less explicit.
Notes
1. This paper is a revision of ‘A Seamless Analysis of Micronesian Noun Incorporation’, one of the plenary papers presented at the Fifth Annual Meeting of the Austronesian Formal Linguistics Association, Honolulu, 1998. I would like to thank Byron Bender for pointing out several errors in the Marshallese examples.
2. See Zide 1976 for the contrasting view.
3. Of course an IA analysis could claim that the form b is present in kg b a, but it would then have to account for what happened to the final b in g b ‘sit’.
4. Often apparently one constructed by an impatient orangutan.
5. In fact many widely accepted cases of ‘compounding’ also fail to meet the generally accepted definitions of ‘compound’.
6. In this sense Lexical Phonology/Morphology might reasonably claim an advantage in explaining the anaphoric inertness of elements within compounds (cf. Simpson 1983a) (Hendrick 1995: 324–25).
7. ‘There is yet a third variant of the lexicalist position. This view tries to explain why derived nominals are not transformationally related by denying transformations the ability to analyze word internal morphology at all. I will call this view the strong lexicalist hypothesis. It is first advocated in Jackendoff (1972) and is defended most recently and vigorously by Di Sciullo and Williams (1987), Lapointe (1980, 1988) and Selkirk (1982). There are variants of the strong lexicalist hypothesis. Selkirk (1982) and Di Sciullo and Williams (1987) permit syntactic rules to refer to morphological features. Lapointe (1980, 1988) denies that syntactic rules ever refer to anything besides the features that define syntactic categories’ (Hendrick 1995: 306).
8. N.B. Not ‘Chomskyan’.
9. Orthogonal classifications in terms of predicate dependents and impersonality will not be considered here.
10. The empirical advantages of this assumption have been demonstrated extensively in the lexicase literature, and will not be repeated in detail here.
11. Adapted from Springer 1993: 72. The notation ?[X] means ‘takes a word of category X as an immediate dependent’.
12. According to Pagotto and Sugita, Marshallese does not allow incorporated objects. By my interpretation of the data, it does.
13. ear is an auxiliary verb which is the grammatical head of the sentence. The dependent verb (kañ in this case) is an infinitival complement which takes its actr complement index from the PAT index of the auxiliary verb (Starosta 1997). I will not formalize this part of the analysis in this paper, and will treat the non-auxiliary verb for purposes of explication as if it were the head verb of the sentence.
14. The notation [m[AGT]] indicates that an Agent is recoverable from the linguistic or extra-linguistic context. This index must be instantiated by rules of discourse, which have not been developed in the lexicase framework yet. This phenomenon is referred to in Chomskyan linguistics as ‘PRO-drop’, but the lexicase analysis differs in that no empty category PRO is required or allowed.
15. This class has been identified so far only in Marshallese (this section) and Trukese (cf. Sugita 1973).
16. Rehg does not indicate any awareness of the fact that this same term is used in the analysis of Munda languages such as Sora (e.g., Starosta 1967, Starosta 1971).
17. Again, Harrison does not indicate any awareness of the fact that this same term is used in the analysis of Munda languages such as Sora.
18. This is of course also the reason why the ‘incorporated’ los may not co-occur with the demonstrative o ‘that’.
19. ‘Sproat (1985a, 1988b) offers another explanation for facts like (44). Sproat argues that there is an independently needed principle that prevents maximal projections in general from appearing internal to an English word,...’ (Hendrick 1995: 325).
20. ‘Generally, a compound-member cannot, like a word in a phrase, serve as a constituent in a syntactic construction’ (Bloomfield 1933: 232). This is why you can’t have very blackbirds eating rather blueberries (cf. Bloomfield 1933: 275).
21. ‘The words of the type devil-may-care are classed as words (phrasewords) because of certain other features which, within the system of the English language, place them on a level with other words.... Another is their indivisibility: ...This latter principle, namely that a word cannot be interrupted by another form, holds good almost universally...’ (Bloomfield 1933: 180).
22. ‘In meaning, compound words are usually more specialized than phrases...’ (Bloomfield 1933: 227).
23. Example: kh|aw kh|aahàk he/she leg-break ‘He broke his (own) leg; she broke her (own) leg.’
24. The use of the VP node to describe word and clitic structure instead of syntactic structure is connected with the bizarre notion that syntax has nothing to do with words: ‘It is suggested that to include the word on the same level as the other units would make the statement of the syntax very complex, and would scarcely reflect the structure of the language. For the latter reason it is felt that it is not desirable to include words (as normally defined) as units at any level in Samoan grammar’ (Pawley 1961: 39–40). This idea in turn derives from the structuralist analysis of New Zealand Maori by Bruce Biggs.
25. This includes Chomskyan ‘minimalism’ as well. In a minimalist analysis, an N need not have a projection if it does not branch, but there is also nothing in the theory that would prevent branching here. An adequate analysis of NI by contrast must account for why an ‘incorporated’ N cannot branch.
26. Example (11) is also of this type.
27. Pagotto analyzes these examples as ‘directional locative manner verbs’, but in SS’s analysis, they are not locational, since they have no LOC complement.
References
Anderson, Stephen R. 1985. ‘Typological distinctions in word formation’. In Language Typology and Syntactic Description III: Grammatical Categories and the Lexicon, ed. by Timothy Shopen, pp. 3–56. Cambridge: Cambridge University Press.
Anderson, Stephen R. 1992. A-morphous Morphology. Cambridge: Cambridge University Press.
Baker, Mark C. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago: The University of Chicago Press.
Biligiri, H.S. 1965. ‘The Sora verb, a restricted study’. In Indo-Pacific Linguistic Studies, Part II: Descriptive Linguistics, ed. by G.B. Milner and Eugenie J.A. Henderson, pp. 231–50. Amsterdam: North-Holland Publishing Co.
Bloomfield, Leonard. 1933. Language. New York: Henry Holt and Company.
Chao, Yuen Ren. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.
Chomsky, Noam A. 1965. Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
Comrie, Bernard. 1978. ‘Ergativity’. In Syntactic Typology: Studies in the Phenomenology of Language, ed. by Winfred P. Lehmann, pp. 329–94. Austin: University of Texas Press.
Di Sciullo, A.M. and E. Williams. 1987. On the Definition of Word. Cambridge, Mass.: MIT Press.
Donegan, Patricia J. and David L. Stampe. 1983. ‘Rhythm and the holistic organization of language structure’. In Papers from the Parasession on the Interplay of Phonology, Morphology, and Syntax, ed. by John F. Richardson, Mitchell Marks and Amy Chukerman, pp. 337–53. Chicago: Chicago Linguistic Society.
Ford, Alan and Rajendra Singh. 1996. ‘Quelques avantages d’une linguistique débarrassée de la morpho(pho)nologie’. In Trubetzkoy’s Orphan: Proceedings of the Montréal Roundtable on Morphonology. Contemporary responses, ed. by R. Singh, pp. 119–39, 140–54. Amsterdam Studies in the Theory and History of Linguistic Science, Series IV: Current Issues in Linguistic Theory, Volume 144. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Ford, Alan, Rajendra Singh and Gita Martohardjono. 1997. ‘Pace Pāṇini: Towards a Word-based Theory of Morphology’. American University Studies Series XIII, Linguistics, 34. New York: Peter Lang.
Fromkin, Victoria and Robert Rodman. 1988. An Introduction to Language (Fourth edition). New York: Holt, Rinehart and Winston, Inc.
Gibson, Jeanne D. and Stanley Starosta. 1990. ‘Ergativity east and west’. In Linguistic Change and Reconstruction Methodology. Trends in Linguistics Studies and Monographs 45, ed. by Philip Baldi, pp. 195–210. Berlin: Mouton de Gruyter.
Harrison, Sheldon P. 1976. Mokilese Reference Grammar. Honolulu: The University Press of Hawai‘i.
Hendrick, Randall. 1995. ‘Morphosyntax’. In Government and Binding Theory and the Minimalist Program, ed. by Gert Webelhuth, pp. 297–349. Oxford: Blackwell.
Ho, Arlene Y.L. 1993. ‘Transitivity, focus, case and the auxiliary verb systems in Yami’. Bulletin of the Institute of History and Philology, Academia Sinica. 62(1): 83–147.
Hopper, Paul J. and Sandra A. Thompson. 1980. ‘Transitivity in grammar and discourse’. Language. 56: 251–99.
Huang, Lillian M. 1994. ‘Ergativity in Atayal’. Oceanic Linguistics. 33(1): 129–43.
Jensen, John Thayer. 1977. Yapese Reference Grammar. Pali Language Texts: Micronesia. Honolulu: The University Press of Hawai‘i.
Lee, Keedong. 1974. ‘Kusaiean verbal derivation rules’. Ph.D. dissertation. Honolulu: University of Hawai‘i.
Lee, Keedong. 1975. Kusaiean Reference Grammar. Pali Language Texts: Micronesia. Honolulu: The University Press of Hawai‘i.
Masica, Colin P. 1991. The Indo-Aryan Languages. Cambridge: Cambridge University Press.
Mithun, Marianne. 1984. ‘The Evolution of Noun Incorporation’. Language. 60: 847–94.
Mithun, Marianne. 1986. ‘On the nature of noun incorporation’. Language. 62(1): 32–37.
Mohanan, K.P. 1996. ‘Where Does Morphophonology Belong? Comments on Ford and Singh’. In Trubetzkoy’s Orphan: Proceedings of the Montréal Roundtable on Morphonology. Contemporary responses, ed. by Rajendra Singh, pp. 119–39. Amsterdam Studies in the Theory and History of Linguistic Science, Series IV: Current Issues in Linguistic Theory, Volume 144. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Ng, Siew Ai and Stanley Starosta. 1996. ‘Compounding confusion in Chinese word-formation’. In NACCL: Proceedings of the 7th North American Conference on Chinese Linguistics and 5th International Conference on Chinese Linguistics Proceedings (NACCL and ICCL), Vol. II: Discourse, Historical Linguistics, Morphology, Phonology and Phonetics, ed. by Tsai-Fa Cheng, Yafei Li and Hongming Zhang, pp. 307–24. Los Angeles: GSIL Publications, Department of Linguistics, University of Southern California.
Pagotto, Louise. 1987. ‘Verb subcategorization and verb derivation in Marshallese: A localistic lexicase analysis’. Ph.D. dissertation. Honolulu: University of Hawai‘i; Microfilm MS10487, Microfilm V10487.
Pawley, Andrew K. 1961. ‘A scheme for describing Samoan grammar’. Te Reo (4). Proceedings of the Linguistic Society of New Zealand. 38–43.
Pawley, Andrew K. 1970. ‘Grammatical reconstruction and change in Polynesia and Fiji’. In Pacific Linguistic Studies in Honour of Arthur Capell. Pacific Linguistics, Series C, no. 13, ed. by S.A. Wurm and D.C. Laycock, pp. 301–67. Canberra: Linguistics Circle of Canberra.
Rehg, Kenneth L. 1981. Ponapean Reference Grammar. Pali Language Texts: Micronesia. Honolulu: The University Press of Hawai‘i.
Rosen, S. 1989. ‘Two types of noun incorporation: A lexical analysis’. Language. 65: 294–317.
Sadock, Jerrold M. 1986. ‘Some notes on noun incorporation’. Language. 62(1): 19–31.
Sapir, Edward. 1911. ‘The problem of noun incorporation in American languages’. American Anthropologist n.s. 13: 250–82.
Singh, Rajendra and Probal Dasgupta. 1999. ‘On so-called Compounds’. In The Yearbook of South Asian Languages and Linguistics 1999, ed. by R. Singh, pp. 318–35. New Delhi: Sage Publications.
Sohn, Ho-min. 1975. Woleaian Reference Grammar. Pali Language Texts: Micronesia. Honolulu: The University Press of Hawai‘i.
Starosta, Stanley. 1967. ‘Sora syntax: A generative approach to a Munda language’. Ph.D. dissertation. Madison: Department of Linguistics, University of Wisconsin.
Starosta, Stanley. 1971. ‘Derivation and case in Sora verbs’. Indian Linguistics. 32(3): 194–206.
Starosta, Stanley. 1988 [published in 1991]. ‘A grammatical typology of Formosan languages’. Fang-kuei Li memorial volume. Bulletin of the Institute of History and Philology, Vol. LIX, Part II, pp. 541–76. Taipei: Academia Sinica.
Starosta, Stanley. 1989–1990. ‘Sora combining forms and pseudo-compounding’. In Proceedings of the Symposium on Austro-Asiatic Languages. Mon-Khmer Studies XVIII–XIX, ed. by David Thomas, pp. 77–105.
Starosta, Stanley. 1997. ‘Control in constrained dependency grammar’. In Reconnecting Language: Morphology and Syntax in Functional Perspectives, ed. by Anne-Marie Simon-Vandenbergen, Kristin Davidse and Dirk Noël, pp. 99–138. Amsterdam: John Benjamins Publishing Company.
Starosta, Stanley. 1998. ‘Ergativity, transitivity, and clitic coreference in four Western Austronesian languages’. In Case, Typology, and Grammar: In Honour of Barry J. Blake, ed. by Anna Siewierska and Jae Jung Song, pp. 277–306. Typological Studies in Language. Amsterdam: John Benjamins.
Starosta, Stanley, Koenraad Kuiper, Siew Ai Ng and Zhiqian Wu. 1998. ‘On defining the Chinese compound word: Headedness in Chinese compounding and Chinese VR compounds’. In New Approaches to Chinese Word Formation: Morphology, Phonology and the Lexicon in Modern and Ancient Chinese. Trends in Linguistics. Studies and Monographs, ed. by Jerome L. Packard, pp. 347–70. Berlin: Mouton de Gruyter.
Sugita, Hiroshi. 1973. ‘Semitransitive verbs and object incorporation in Micronesian languages’. Paper presented in the First International Conference on Comparative Austronesian Linguistics, 1974: Oceanic. Oceanic Linguistics XII. 12: 393–406.
Zide, Arlene. 1976. ‘Nominal combining forms in Sora and Gorum’. In Austroasiatic Studies, ed. by Philip Jenner, Laurence Thompson and Stanley Starosta. Honolulu: The University Press of Hawai‘i.
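8
Semantic Fragmentation in Word-Formation: The Case of Spanish -AZO¹
Franz Rainer
Es ist ... notwendig, jede einzelne Wortbildung zu individualisieren. Der Grundsatz der neueren Lexikographie ‘jedes Wort hat seine eigene Geschichte’ gilt auch für die Wortbildungslehre. Die Serien von Wörtern gleicher Endung oder Vorsilbe sind ‘mirages’, sie zerfallen bei mikroskopischer Betrachtung ebenso wie die Lautgesetze: [...] So mündet denn die Wortbildungslehre von selbst in das Wörterbuch ein.
[‘It is ... necessary to individualize every single word-formation. The principle of modern lexicography, “every word has its own history”, also holds for the study of word-formation. The series of words with the same ending or prefix are “mirages”; under microscopic examination they disintegrate just as the sound laws do: [...] Thus the study of word-formation flows of its own accord into the dictionary.’]
(Leo Spitzer. 1923. Archivum Romanicum (7) p. 198)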
Rule or Analogy?
The analogical theory of word-formation, taken for granted in former times, but stigmatized during the heyday of structuralism and generative grammar, has been witnessing a welcome renaissance over the past few years.² According to the extreme version of this theory, new complex words are not formed according to general rules but according to single models. If this were the case, the whole business of word-formation could be reduced to the one single statement, ‘Look for an appropriate model and coin the new word according to it.’ Most analogists, however, are less radical and claim that new words may be coined according to the model of either one single word or of a group of words. They also generally admit that in privileged cases this group may be so large that speaking of analogy or rule-based creativity amounts to more or less the same thing. Both analogists and rule-based morphologists thus generally agree on the existence of both single word-analogy and completely general patterns, but each of them conceives of the opposite end as the normal case. No reliable statistics are available yet that would allow a principled answer to this problem. Intuitively, I would say, on the basis of my experience in a full-scale description of Spanish word-formation (cf. Rainer 1993), that there is a strong tendency in the literature to overestimate the degree of generality and productivity of word-formation rules. Most Spanish word-formation rules have such a low productivity that it was generally easy to identify the ‘leader word’ or, more often, ‘leader group’ of neologisms.
In this contribution, the ubiquity of semantic fragmentation is presented as further evidence in favour of the analogical conception of word-formation processes. Semantic fragmentation here means that a once semantically homogeneous word-formation process is split up in the course of time into a series of different processes. I will first briefly sketch the fragmentation process of the Spanish suffix -azo (cf. also Figure 8.1),³ then describe the radial structure resulting from this process in synchrony. Two hitherto neglected processes, which will be called metaphoric and metonymic approximation, are shown to play an important role in the fragmentation of affixes. On the theoretical level, it will be argued that a Gesamtbedeutung analysis is inadequate, while the observed phenomena readily fit an analogical model.
The Fragmentation Process of -azo
In the late Middle Ages, our suffix, whose diachronic relationship to the homonymous augmentative suffix -azo/a is still not definitively settled (cf. Malkiel 1959),⁴ meant ‘blow, stroke, etc., with an x’: azotazo ‘lash with a whip (azote)’, etc. The base, in the central type, denotes the instrument moved in order to execute the blow, stroke, etc., and generally some aggressive intention is involved. The productivity of this type already during the Renaissance is proved by the fact that not only prototypical instruments of aggression such as whips, rods, sticks, weapons, stones or elbows may be found as bases, but also more occasional ones such as slippers, jugs, pans, brooms, fire hooks, stools, candlesticks, books, oranges, and even pig’s bladders. This central type, by the way, continues to be by far the most important and productive one.⁵ Among the numerous neologisms, suffice it to mention coctelazo (ABC 26–VIII–1998 p. 19), said of Molotov cocktails thrown by Basque rioters,⁶ huevazo (El País 8–VIII–1997 p. 1), of eggs (huevo) thrown at local politicians, or raquetazo (La Verdad [Murcia] 6–VIII–1997, p. 27), a stroke with a tennis racket (raqueta).
[Figure 8.1: Family Tree of -azo. Time axis: Middle Ages, Renaissance, XVIII/XIX, XX. Types shown: AZOTAZO ‘blow, etc., with x’ (azote ‘whip’); ESPALDARAZO ‘blow on x’ (espalda(r) ‘back’); CAÑONAZO ‘shot with x’ (cañón ‘cannon’); BOCAZO ‘misfire’ (boca ‘blasthole’); JERINGAZO ‘injection with x’ (jeringa ‘syringe’); TROMPETAZO ‘blast of x’ (trompeta ‘trumpet’); BOMBAZO ‘explosion of x’ (bomba ‘bomb’); BROCHAZO ‘brusque movement of x’ (brocha ‘brush’); AGUARDENTAZO ‘gulp of x’ (aguardiente ‘fire-water’); Am. suelazo ‘tumble of x’ (suelo ‘ground’); Chil. diucazo ‘warble of the diuca (a finch)’; Mex. avionazo ‘air-crash’ (avión ‘plane’); cuartelada, cuartelazo ‘barrack-putsch’; BOGOTAZO ‘riot in x’ (from Bogotá); OBISPAZO ‘spectacular pol. action of x’ (obispo ‘bishop’); CATASTRAZO ‘spectacular increase of x-tax’ (catastro ‘ground register’); PEPAZO ‘crowded electoral meeting of x’ (from Pepi Montesdeoca). Link labels: ‘shot’ > ‘accompanying noise’; ‘blow’ > ‘gulp of liquor’.]
By the early Renaissance the suffix had produced a closely related offspring with the meaning ‘shot with an x’: cañonazo ‘cannon-shot’ (cañón ‘cannon’), etc. The base here is still instrumental, but it is not moved, and contact is indirect (via a projectile). There are numerous established formations of this type, as well as some neologisms: riflazo ‘shot with a rifle (rifle)’, etc. Among the established formations, only balazo ‘shot’ (from bala ‘bullet’) shows a somewhat different relationship between base and derivative, while among the neologisms we find an agent in the base-position in the following example: los ‘tancazos’ son tan violentos como los ‘scottazos’ (quoted in Nord 1983: 63) ‘the shots of Tanco are as violent as those of Scotta’ (Tanco and Scotta are football players). Technical formations of the type bocazo ‘misfire’ are probably due to the analogy between the barrel of firearms and a blasthole (boca). Another direct offspring of the cañonazo-type seems to be constituted by the type jeringazo ‘injection’ (jeringa ‘syringe’). Here there is still a pipe, but what is ejected through this pipe is no longer a projectile, but a liquid. Some recent neologisms of this kind are quoted in Lang (1990: 218): duchazo ‘shower (event)’ (from ducha ‘shower (instrument)’), manguerazo ‘shower’ (from manguera ‘hose’), and regaderazo ‘shower’ (from regadera ‘watering can’). Derivatives of the cañonazo-type regularly have metonymic extensions referring to the accompanying noise. These, in turn, form the link to the type trompetazo ‘blast of a trumpet (trompeta)’, which is set apart from the central type not only by the lack of motion of the instrument designated by the base, but also by the lack of any intention of doing physical harm.
Two formations of this type from Venezuela are quoted in Tejera (1996: 58): cornetazo ‘horn (corneta) signal’ and pitazo ‘sound produced with a whistle (pito)’. The next generation of this acoustic sideline is the Chilean formation diucazo ‘warble of the diuca (a finch)’ (Oroz 1966: 286), where the base is animate and agentive in contrast to its model, the type trompetazo. The type bombazo ‘explosion of a bomb (bomba)’ is also best regarded as based on the acoustic extension of the type cañonazo, if it is to be considered a separate type at all (but note that there is no projectile involved, etc.). For its part, the type bombazo seems to have fathered the Mexicanisms avionazo ‘air crash’ (avión ‘plane’) and trenazo ‘train accident’ (tren ‘train’), where the base lacks instrumental function. Let us now come back to the central type in order to follow other sidelines.
By the Renaissance the central type had also come to mean ‘blow on x’: espaldarazo ‘slap on the back [espalda(r)]’, etc. This type probably arose through a reinterpretation of pragmatically ambiguous cases like cabezazo ‘blow with the head (cabeza)’, which, when unintentional (as when you bump your head against the door), are more naturally interpreted as ‘blows on x’. A relatively recent formation according to this type is barrigazo ‘belly (barriga) flop’. The same meaning is also expressed by piscinazo (from piscina ‘swimming pool’), where the base, however, no longer denotes the body-part which receives the blow, but the place struck. The same metonymic switch may be observed in the Americanism suelazo ‘tumble on the ground (suelo)’. The next sideline is mediated by the metaphoric extension from ‘blow’ to ‘gulp of hard liquor’, as in latigazo ‘lit. lash with a whip (látigo); fig. gulp’. By maintaining the general metaphoric meaning ‘gulp’, but substituting the name of a strong liquor for the base, a new type, particularly well represented in Latin America (cf. Tejera 1996: 59–62), has been created: aguardentazo ‘gulp of “fire-water” (aguardiente)’, etc. The case of copazo ‘drink’ (from copa ‘glass’) and copitazo ‘drink’ (from the diminutive copita ‘small glass’) is somewhat different, since the base here is the container, not the alcohol. These formations are thus probably best accounted for by attributing them to the central type (or to the type jeringazo). The type brochazo ‘stroke of the brush (brocha)’ is quite closely related to the central type. It simply refers to some brusque movement, lacking any aggressive intention. Further instances of this type are abanicazo ‘movement of a fan (abanico)’, lengüetazo ‘movement of the tongue (lengua)’, among many others.
Doublets like hisopazo ‘aspersion with the aspergillum (hisopo)’ and hisopazo ‘stroke with the aspergillum’ (on a ministrant’s head, for example) strongly suggest that the ‘brusque movement’ meaning and the ‘blow’ meaning should be kept apart. We now turn to the most recent sideline of the -azo-clan (cf. de Bruyne 1978). Probably by affix-substitution on the basis of cuartelada ‘barracks-putsch/riot’ (cuartel ‘barracks’), exploiting the near synonymy of -ada and -azo for denoting blows, the term cuartelazo was created in Latin America at the beginning of the 20th century. This formation then seems to have served as the model for bogotazo, the denomination of a popular riot in Bogotá in 1948. This was obviously a very useful suffixal innovation for Latin America, where many similar creations, referring either to popular riots or to military coups, are attested in the following decades. The riots generally take as their base the name of the place where they occurred: caracazo (Caracas), cordobazo (Córdoba, Argentina), mendozazo (Mendoza, Argentina), etc. But other semantic relations are also attested. In Argentina, peronazo referred to a riot of Peronist inspiration; the protest meetings against Salvador Allende in the fashionable quarters of Santiago de Chile were called cacerolazos because of the noise produced by the participants by beating on casseroles (cacerola); and in Venezuela the protests of students for higher budgets, which consisted of taking the desks (pupitre) out on the streets in order to have their lessons there, were recently called pupitrazo. Coups, on the other hand, normally have the instigator as their base. The first on record is the pinochetazo of 1973: it seems that through the international stir created by Pinochet’s putsch this type has since also become acclimatized in the Iberian Peninsula, which witnessed an espinolazo (Spinola) in Portugal in 1974 and a tejerazo (Tejero) in Spain in 1981. In Latin America, the best-known recent instance is the fujimorazo, Fujimori’s coup in Peru. Since then, this type has given rise to new offspring in the political or journalistic jargon on both sides of the Atlantic. The alfonsinazo, President Alfonsín’s sweeping electoral victory in Argentina in 1983, may be seen as a democratic counterpart of a coup. Further developments are attested in Venezuela (cf. Chumaceiro 1987), where the suffix may also refer to crowded electoral assemblies; the base either designates the politician (pepazo; from Pepi Montesdeoca, around 1980) or the place (poliedrazo; Poliedro, Caracas).
In the Spanish media, the suffix is used in order to designate politically spectacular actions or statements, in which case generally their authors serve as bases: a pastoral letter of Catalan bishops (obispo), e.g., requiring the right of self-determination for Catalonia was called obispazo (El Mundo 7–IX–1991, pp. 2 and 7) in 1991, while in 1996 the fact that Vidal-Cuadras, then leader of the Partido Popular in Catalonia, called the Catalan nationalists insidious and reactionary was referred to as vidalazo (Tiempo 19–VIII–1996, p. 44). Even more recent seems to be the use of -azo in order to refer to heavy increases in prices or taxes. In Spain, the ancestor seems to have been the catastrazo of 1990, a drastic increase in the poll-tax (from catastro ‘cadastre’). When in 1991 local telephone fees rose by 100 per cent, this event was called telefonazo. In 1998, another rise in telephone fees was called tarifazo (ABC 17–VIII–1998, p. 22; from tarifa ‘tariff’). The exclusion in 1993 of a thousand medicines from those paid for by Social Security was called medicamentazo, a word still very much in vogue. Similarly, in 1996 the idea of making Spaniards pay 100 ptas for every receta (‘prescription’) was called recetazo (El País 14–IX–96, p. 1). While in politics -azo has become specialized in price rises, the language of advertising has adopted this same suffix for coining catchy expressions for spectacular price reductions. As far as I can see, this usage is only attested for Venezuela (cf. Chumaceiro 1987: 364): cauchazo (concerning tyres [caucho]), cortinazo (concerning curtains [cortina]), etc. In Latin America, the suffix has been extended even further. When in 1994 Mexico was on the brink of financial collapse, the facetious tequilazo, derived from tequila, the national hard drink of Mexico, was deemed to be an appropriate designation. In Colombia a spectacular defeat of the football team in Berne, Switzerland was called bernazo, and in Argentina the victory over Brazil was recently referred to as argentinazo. In Chile, a festivity where champagne (champán) is consumed is called champanazo, and in Mexico the unexpected birth of a late-comer is a santanazo (from Santa Ana, Saint Ann, who is said to have given birth to Mary at an advanced age). Many more such offshoots could certainly be collected in Latin American dictionaries and newspapers, and it is hoped that the lacunae of this description will stimulate colleagues from different Latin American countries to publish more detailed accounts of their respective varieties. In the absence of such accounts (but see Chumaceiro 1987 and Tejera 1996 for Venezuela) and reliable dictionaries of neologisms for the Spanish language in its different varieties,⁷ the exact interrelations between these more recent types as well as their diffusion and number are difficult to establish.
The same, by the way, is also true for the early stages of the evolution of -azo, where the existence, in a remote future, of a reliable thesaurus of the Spanish language will certainly force the introduction of some modifications. This cursory treatment thus cannot of course do justice to the real complexity of the subject matter, but it is sufficient for our present purposes.
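The lines of descent sketched in this section can be summarized as a small data structure. The following Python sketch is only an illustration, not part of the original description: the names PARENT, lineage and offspring are hypothetical, each type is reduced to a single model, and links that the text presents tentatively ('seems', 'probably') are entered as if they were settled.

```python
# Assumed simplification: every type has at most one model (its "parent").
# The links are read off the prose account above; several of them are
# presented there only as probable.
PARENT = {
    "azotazo": None,            # central type: 'blow with x'
    "cañonazo": "azotazo",      # 'shot with x'
    "bocazo": "cañonazo",       # 'misfire'
    "jeringazo": "cañonazo",    # 'injection with x'
    "trompetazo": "cañonazo",   # via the 'accompanying noise' extension
    "diucazo": "trompetazo",    # 'warble of the diuca'
    "bombazo": "cañonazo",      # 'explosion of x'
    "avionazo": "bombazo",      # 'air crash'
    "espaldarazo": "azotazo",   # 'blow on x'
    "suelazo": "espaldarazo",   # 'tumble on the ground'
    "aguardentazo": "azotazo",  # via 'blow' > 'gulp of liquor'
    "brochazo": "azotazo",      # 'brusque movement of x'
    "cuartelada": None,         # external model in -ada
    "cuartelazo": "cuartelada", # affix substitution
    "bogotazo": "cuartelazo",   # 'riot in x'
}


def lineage(word_type):
    """Return the chain of models from a type back to its oldest ancestor."""
    path = []
    current = word_type
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def offspring(word_type):
    """Return the types modelled directly on the given type, alphabetically."""
    return sorted(child for child, parent in PARENT.items() if parent == word_type)


if __name__ == "__main__":
    print(" < ".join(lineage("diucazo")))
    print(offspring("cañonazo"))
```

Calling lineage on a peripheral type such as diucazo makes the point of the section concrete: the type is linked to the central 'blow' meaning only through a chain of local steps, not through a shared abstract meaning.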
Gesamtbedeutung vs Polysemy
What kind of evidence is there that -azo must be split up into a number of types, as we have tacitly assumed up to now? One could point to the temporally delayed appearance of the various types, to differences in register and regional distribution, or to the existence of doublets such as trompetazo ‘blast of a trumpet’ and trompetazo ‘blow with a trumpet’, etc. But the main argument, to my mind, is that only the fragmentation hypothesis allows one to delimit the domain of our suffix(es) in a realistic way. Valdivieso and Pandolfi (1982) have tried to derive all uses of -azo as contextual variants of an abstract meaning ‘physical impact’. Such a hypothesis, however, suffers from several serious defects. I will not belabour the fact that they never explicitly state how to derive contextual variants. What is even more problematic is that their abstract meaning is both too broad and too narrow to predict the actual range of attested and possible words in -azo. On the one hand, ‘physical impact’ does not cover the type obispazo and several others (which, by the way, they do not seem to be aware of). This could be obviated by making the abstract meaning even more abstract. What is more serious is that their abstract meaning predicts the possibility of a certain range of formations which simply are not acceptable. Thus, the meaning ‘blow given by a boxer’ certainly constitutes a ‘physical impact’ and, furthermore, is pragmatically highly plausible, but nevertheless formations like boxeadorazo, pugilazo or pugilistazo (from boxeador, púgil, pugilista, all meaning ‘boxer’), with an agent in base-position, are decidedly odd (they are conceivable augmentatives, of course). Even within the various types, not just any noun can occupy the place of the base. Thus Gauger (1971: 25) observed that the type espaldarazo is restricted to body-parts, while a pragmatically plausible derivative like mesazo ‘blow on a table’ is odd. This restriction to body-parts is characteristic only of the type espaldarazo, and thus could not be encoded at a more general level.
In the most recent developments of the suffix one may gain the impression that anything goes. Even here, however, an abstract meaning like ‘surprising event that has something to do with the base’ will turn out to be too general. Several examples suggest that we are dealing with relatively local analogies of the type medicamento : medicamentazo = receta : x, analogies that have taken different paths in different varieties of Spanish. Since most of these formations are creative journalistic occasionalisms and not just instances of a well-established usage, speakers’ judgements unfortunately are of little help for assessing these two contrasting hypotheses. But on the whole one may
Semantic Fragmentation in Word-Formation ! 205
conclude that the hypothesis of an abstract meaning fails to account for the systematic non-existence of derivatives for some pragmatically plausible semantic configurations and for the existence of arbitrary, type-specific restrictions. In order to explain gaps such as *boxeadorazo or *mesazo, works in the spirit of Valdivieso and Pandolfi’s often resort to Coseriu’s notion of norm (cf. Coseriu 1952), which is a rough synonym of established usage. Instead of solving the problem, however, such a move only serves to disguise it by lumping together such non-established and unacceptable formations with non-established but readily acceptable ones such as coctelazo, huevazo, raquetazo, which we have already introduced above, or paraguazo ‘stroke with an umbrella (paraguas)’, tiestazo ‘blow with a flower-pot (tiesto)’, etc. Only for the latter group does it make sense, to my mind, to speak of possible words, while creations like boxeadorazo or mesazo are merely imaginable, i.e., conceivable results of future extensions of our suffix. Should they ever come into existence as types, this would constitute a change not just in established usage, but in the linguistic system, defined here as that level where the set of possible words is defined intensionally and from which all meanings can be derived by independently motivated mechanisms (cf. Rainer 1989). Even though the distinction between possible and imaginable words admittedly is not razor-sharp, it is of crucial importance to bear it in mind in order to come to grips with the problem of the demarcation of domains of word-formation rules in accordance with speakers’ judgements. The flaws of Valdivieso and Pandolfi’s analysis are shared, to a greater or lesser degree, by a whole tradition of European structuralist studies on morphological semantics (cf. also Taylor 1989: 142–44), from Jakobson’s 1936 Gesamtbedeutungen of the Russian cases (note 8) up to Coseriu-style analyses of Spanish word-formation like Laca (1986) or Staib (1988).
These structuralist studies are further vitiated by the assumption that the abstract meanings must enter into some semantic system based on opposition, if Bybee (1986) is right that ‘there is no psychological organizing principle of language that tells all the (grammatical – and a fortiori word-formational; F.R. – morphemes) in particular semantic domains to get into slots and arrange themselves for contrast and mutual exclusivity’ (p. 20). Works akin to this Gesamtbedeutung-tradition have also appeared in the generative school, e.g., Botha (1988). Though this study is not
subject to the last criticism and is more explicit about the independent mechanisms putatively responsible for the derivation of the contextual variants, it is not evident that it will stand the central test of empirical adequacy, i.e., avoid the problem of overgeneration.

Having circumnavigated the Scylla of Gesamtbedeutung, how are we to avoid the Charybdis of pure homonymy? I would like to argue here that related affixes of our type are best represented in a synchronic grammar of Spanish as a radial structure (note 9); essentially the same assumption, by the way, seems to be implicit in Gauger’s analysis of our suffix. A radial structure generally, but not necessarily, consists of a central member, the prototype, surrounded by a bunch of less central members. A less central or peripheral member is not predictable from the central member, but it is motivated by it, i.e., there is some independently existing link that makes sense of their relationship (cf. Lakoff 1987: 448). In our case, the links are either metaphorical or metonymical, as we have seen. Thus, e.g., diucazo is metaphorically linked to the type trompetazo, which in turn is also metaphorically linked to the metonymic acoustic extension of the type cañonazo, which is again metaphorically linked to the central type azotazo. In other cases, such as suelazo and many others which we have seen, the link is based on the relation of contiguity between the traditional type of base and the innovative one. Some of these metaphorical and metonymical links recur elsewhere in the lexicon, but some are only ‘independently motivated’ inasmuch as they are possible metaphors/metonymies. The radial structure by and large reflects the single steps of the diachronic fragmentation process, but this need not necessarily be so. Where the diachronic relationships are no longer apparent to the speaker, a peripheral type may always be relinked.
Thus it is conceivable, e.g., that the type catastrazo, which from a historical perspective seems to be an offspring of the bogotazo type, is directly referred by speakers to the central type and reinterpreted as a metaphoric blow. The analysis of -azo as a radial structure nicely accounts for the observed centrality gradience: the peripherality of a certain type may roughly be defined as a function of the number of intervening links. This analysis also immediately explains why the various members do not share any defining features ([+event] is common to all, but extends far beyond the structure), though there are a lot of family resemblances, as can be seen in Table 8.1.
Table 8.1
(Ratio = ratio of attested examples to neologisms)

Derivative    Ratio   Event  Physical  Aggressive  Base:
                             impact    intention   agentive  animate  locative  liquid  moved  instrumental
aguardentazo  0/4     +      +         –           –         –        –         +       +      ?
avionazo      0/2     +      +         –           –         –        –         –       +      –
azotazo       170/18  +      +         +           –         –        –         –       +      +
bocazo        3/1     +      ?         –           –         –        +         –       –      –
bogotazo      1/5     +      ?         –           –/+       –/+      +/–       –       –      –
bombazo       4/1     +      +         +           –         –        –         –       –      +
brochazo      14/12   +      +         –           –         –        –         –       +      +/–
cañonazo      11/4    +      +         +           –         –        –         –       –      +
catastrazo    0/5     +      –         –           –         –        –         –       –      –
diucazo       0/1     +      ?         –           +         +        –         –       –      –
espaldarazo   14/1    +      +         –/+         –         +        +         –       –      –
jeringazo     2/3     +      +         –           –         –        –         –       –/+    +
obispazo      0/10    +      –         –           +         +        –         –       –      –
pepazo        0/5     +      ?         –           –/+       +/–      –/+       –       –      –
suelazo       0/2     +      +         –           –         –        +         –       –      –
trompetazo    9/5     +      ?         –           –         –        –         –       –      +
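The definition of peripherality as a function of the number of intervening links can be made concrete with a small sketch. It encodes only the chain of links spelled out in the text (diucazo, trompetazo, cañonazo, azotazo) and, as a simplifying assumption, collapses the acoustic extension of the cañonazo type into the cañonazo node. The names LINKS and peripherality are illustrative, not part of the analysis itself.

```python
from collections import deque

# Links named in the text. Assumption: the metonymic acoustic extension of
# the cañonazo type is not a node of its own but is folded into "cañonazo".
LINKS = {
    "diucazo": ["trompetazo"],
    "trompetazo": ["cañonazo"],
    "cañonazo": ["azotazo"],
    "azotazo": [],
}

def peripherality(links, central, member):
    """Number of links between `member` and `central` (None if unlinked)."""
    # Treat every link as traversable in both directions.
    graph = {}
    for source, targets in links.items():
        graph.setdefault(source, set())
        for target in targets:
            graph[source].add(target)
            graph.setdefault(target, set()).add(source)
    # Breadth-first search outward from the central type.
    distance = {central: 0}
    queue = deque([central])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance.get(member)

for word in LINKS:
    print(word, peripherality(LINKS, "azotazo", word))
```

On this encoding the central type has peripherality 0 and diucazo, three links away, is the most peripheral member. Relinking a type directly to the centre, as envisaged for catastrazo, would amount to adding a single edge to the central node.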
Metaphoric and Metonymic Approximation

There remains to be answered the question of how this analysis supports the analogical theory of word-formation. The answer is: because this theory does not stamp semantic fragmentation as an anomaly, but rather leads one to expect that it should be as pervasive as it is (cf. Rainer 1993: 46–47). An analogical theory is necessarily holistic, i.e., the creation of a neologism amounts to searching for a model (a single word, a group of words or even a rule-like pattern in some privileged cases) and exchanging the value of the base (or affix). The meaning of the derivative thus does not have to be assembled on the basis of the meanings of the constituents (base plus affix) and the meaning of a rule, but is derived from the lexical meaning of the leader word or leader group.
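The formal side of ‘exchanging the value of the base’ can be sketched as the solution of a proportion such as medicamento : medicamentazo = receta : x. The sketch below is a deliberately minimal illustration and covers form only: it assumes that a base loses its final vowel before the ending, and it says nothing about how the meaning is inherited from the leader word. The function name analogize is hypothetical.

```python
VOWELS = "aeiou"

def stem(word):
    """Assumption: a base drops its final vowel before the ending is added."""
    return word[:-1] if word[-1] in VOWELS else word

def analogize(model_base, model_derivative, new_base):
    """Solve model_base : model_derivative = new_base : x by exchanging the base."""
    model_stem = stem(model_base)
    if not model_derivative.startswith(model_stem):
        raise ValueError("the model derivative is not built on the model base")
    # Whatever the leader word adds to its stem is carried over unchanged.
    ending = model_derivative[len(model_stem):]
    return stem(new_base) + ending

print(analogize("medicamento", "medicamentazo", "receta"))
print(analogize("trompeta", "trompetazo", "diuca"))
```

Because the whole leader word serves as the model, nothing in this procedure refers to a rule or to an affix meaning; the ending is simply whatever the model adds to its own base.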
In this search-process, speakers may content themselves with an approximation rather than with a full match, especially if the model and the concept to be expressed are related metaphorically or metonymically. In the numerous cases of metonymic approximation the general frame presupposed by a type of formation remains unchanged and only another cognitively salient element of this frame is chosen as a base. In the frame FALLING DOWN (of a person), e.g., the most salient elements are the three entities person, body-part hurt and place struck, the event itself, and possibly some salient circumstance such as a banana peel. The existence in Spanish of the type espaldarazo with the meaning ‘blow on x’ makes us expect speakers to choose the body-part hurt as the base of a new formation for the concept SINGLE ACT OF FALLING DOWN and thus to coin culazo (from culo ‘backside’) or some more refined synonym. Instead, we observe that Latin-American speakers have chosen the place struck in suelazo (from suelo ‘ground’). We will never know exactly the reason for their choice: they perhaps tried to avoid the four-letter word as a base, or they found the body-part less suited since different body-parts may be involved depending on how exactly one falls down, while the result is always unique in the sense that we lie on the ground (suelo). Whatever the reason may have been in this particular case, one can observe that metonymic switching of this kind is relatively frequent in word-formation and should be recognized as a major determinant of the tendency towards affixal fragmentation. Metaphoric approximation may be exemplified by the relationship between diucazo and the type trompetazo.
It seems that the coiner of diucazo, judging the generic canto ‘song of birds’ too unexpressive, preferred to take as a model the pattern SHARP SOUND PRODUCED BY A MUSICAL INSTRUMENT illustrated by trompetazo and similar formations, substituting the finch for the musical instrument on the basis of relatively straightforward similarities. Though both metaphoric and metonymic approximation are ubiquitous in our data and in the evolution of word-formation in general, they do not seem to have received, until now, the attention they deserve, or, what is more, a name. Another source of fragmentation, on the contrary, is well established in the traditional literature on the history of word-formation (note 10). When the models are single words which have developed peculiarities setting them apart from other formations of the same kind, these peculiarities may be passed on to the neologisms formed after them and so give rise to a new type. The 1991 telefonazo, e.g., not to be
confused with the established telefonazo ‘quick phone-call’, was certainly modelled in direct analogy to the ominous catastrazo of 1990, taking over most of its denotative and connotative peculiarities. Local analogies of this kind, which are far more common than students of word-formation seem to think, make an analogical account inescapable, at least in these cases.
Notes

1. This article is a revised version of my contribution to the International Phonology and Morphology Meeting held in Krems (Austria) in 1992.
2. Cf. Motsch 1977, Bybee 1985, Pinker 1989, Skousen 1989 and 1992, Becker 1990, Moder 1992 or the contributions in Rivista di linguistica 7/2 (1995), 211–72, among others.
3. As far as the examples are concerned, I freely draw upon dictionaries and the previous literature on the subject. Explicit references to the sources of neologisms are normally given only for examples hitherto not mentioned in the literature and drawn from my own corpus.
4. Malkiel’s etymology of -azo does not strike me as particularly plausible. One could perhaps see the origin of the blow-meaning in a reanalysis of azotazo, derived from azote, which originally meant ‘whip’ and then metonymically also ‘lash with a whip’. According to this conjecture, azotazo in the sense of ‘severe lash with a whip’ would first have been formed as an augmentative on the base of the meaning ‘lash with a whip’, but then reanalyzed as a direct derivative of the original meaning of azote, viz. ‘whip’. By such a reanalysis the meaning ‘blow’ would have been attributed directly to the suffix -azo. This conjecture derives some additional plausibility from the fact that whips were probably the prototypical instrument of aggression in the Middle Ages. The poor state of Spanish historical lexicology, unfortunately, does not allow us to assess the chronological plausibility of this conjecture.
5. For indications about the quantitative productivity of our types cf. Table 8.1.
6. Han asado su vivienda a coctelazo limpio.
7. Alvar Ezquerra 1994 is limited to Spain and not particularly informative on our subject.
8. Cf. the critique in Wierzbicka 1980.
9. The source of inspiration, of course, is Lakoff’s 1987 ‘radial category’. But I avoid the term ‘category’, since I follow Kleiber (1990: 174–75) in considering the different nodes of a radial structure not as parts of one single category, but as independent, though related categories.
10. Cf., e.g., Wartburg (1923: 109–10): ‘In der Tat ist die Suffixlehre in hohem Maße in der Wortgeschichte verankert. Ein mit einem Suffix gebildetes Wort kann eben im Verlaufe der Zeit Bedeutungsveränderungen durchmachen, die es von seinem Ursprung weit abführen und sodann in seiner neuen semantischen Funktion wieder als Ausgang neuer Wortbildungen dienen. So erklären sich auch die zahlreichen Spaltungen von Suffixen.’
References

Alvar Ezquerra, Manuel. 1994. Diccionario de voces de uso actual. Madrid: Arco/Libros.
Becker, Thomas. 1990. Analogie und morphologische Theorie. München: Fink.
Botha, Rudolf. 1988. Form and Meaning in Word-formation: A Study of Afrikaans Reduplication. Cambridge: Cambridge University Press.
Bruyne, Jacques de. 1978. ‘Acerca del sufijo -azo en el español contemporáneo’. Iberoromania. 8: 54–81.
Bybee, Joan. 1985. Morphology: A Study of the Relation between Meaning and Form. Amsterdam: Benjamins.
Bybee, Joan. 1986. ‘On the nature of grammatical categories: A diachronic perspective’. ESCOL 1985. Proceedings of the Second Eastern States Conference on Linguistics. Columbus: Ohio State University, pp. 17–34.
Chumaceiro, Irma. 1987. ‘Algunos aspectos de la sufijación en el español de Venezuela’. In Actas del I Congreso Internacional sobre el Español de América, ed. by Humberto López Morales and María Vaquero, pp. 361–71. Madrid.
Coseriu, Eugenio. 1973 [¹1952]. ‘Sistema, norma y habla’. Teoría del lenguaje y lingüística general. Cinco estudios, pp. 11–113. Madrid: Gredos.
Gauger, Hans-Martin. 1971. Studien zur spanischen und französischen Wortbildung. Heidelberg: Winter.
Jakobson, Roman. 1971 [¹1936]. ‘Beitrag zur allgemeinen Kasuslehre. Gesamtbedeutungen der russischen Kasus’. In Roman Jakobson. Selected Writings. Vol. 2: Word and Language, pp. 23–71. The Hague-Paris: Mouton.
Kleiber, Georges. 1990. La Sémantique du Prototype. Paris: Presses Universitaires de France.
Laca, Brenda. 1986. Die Wortbildung als Grammatik des Wortschatzes: Untersuchungen zur spanischen Subjektnominalisierung. Tübingen: Narr.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. Chicago: University of Chicago Press.
Lang, M.F. 1990. Spanish Word Formation. London: Routledge.
Malkiel, Yakov. 1959. ‘The Two Sources of the Hispanic Suffix -azo, -aço’. Language. 35: 193–258.
Moder, Carol L. 1992. ‘Rules and analogy’. In Explanation in Historical Linguistics, ed. by Garry W. Davis and Gregory K. Iverson, pp. 179–91. Amsterdam-Philadelphia: Benjamins.
Monge, Félix. 1972. Sufijos españoles para la designación de ‘golpe’. Homenaje a Francisco Ynduráin, pp. 229–47. Zaragoza.
Motsch, Wolfgang. 1977. ‘Ein Plädoyer für die Beschreibung von Wortbildungen auf der Grundlage des Lexikons’. In Perspektiven der Wortbildungsforschung, ed. by Herbert E. Brekle and Dieter Kastovsky, pp. 180–202. Bonn: Bouvier.
Nord, Christiane. 1983. Neueste Entwicklung im spanischen Wortschatz. Rheinfelden: Schäuble.
Oroz, Rodolfo. 1966. La lengua castellana en Chile. Santiago de Chile: Ed. Universitaria.
Pinker, Steven. 1989. Learnability and Cognition: The Acquisition of Argument Structure. Cambridge, Mass.: MIT Press.
Rainer, Franz. 1989. ‘Das Präfix neo- im Italienischen und in anderen europäischen Sprachen’. Italienisch. 21: 46–58.
Rainer, Franz. 1993. Spanische Wortbildungslehre. Tübingen: Niemeyer.
Skousen, Royal. 1989. Analogical Modeling of Language. Dordrecht: Kluwer.
Skousen, Royal. 1992. Analogy and Structure. Dordrecht: Kluwer.
Staib, Bruno. 1988. Generische Komposita. Funktionelle Untersuchungen zum Französischen und Spanischen. Tübingen: Niemeyer.
Taylor, John R. 1989. Linguistic Categorization. Prototypes in Linguistic Theory. Oxford: Clarendon.
Tejera, María Josefina. 1996. ‘Golpes, balazos, explosiones, impactos físicos y sentidos metafóricos de los sufijos -ada, -azo y -ón en el español de Venezuela’. Boletín de Lingüística. 11: 47–75.
Valdivieso, Humberto and Ana María Pandolfi. 1982. ‘Estructura semántica de -azo’. Revista de Lingüística Teórica y Aplicada. 20: 67–81.
Wartburg, Walther von. 1923. Review of Ernst Gamillscheg and Leo Spitzer. Beiträge zur romanischen Wortbildungslehre. Zeitschrift für Romanische Philologie. 43: 109–15.
Wierzbicka, Anna. 1980. The Case for Surface Case. Ann Arbor: Karoma.
212 ! Robert R. Ratcliffe
9
Towards a Universal Theory of Shape-invariant (Templatic) Morphology: Classical Arabic Re-considered

Robert R. Ratcliffe

Overview

In comparison with familiar European languages, the so-called ‘root’ and ‘pattern’ morphology of Arabic (and Semitic languages generally) has always appeared idiosyncratic to Western linguists. In the context of modern formal linguistics perhaps the most influential attempt to integrate Arabic morphology into a universal linguistic framework is the autosegmental morphology hypothesis of McCarthy (1979, 1981). McCarthy offers a reanalysis of the ‘root’ and ‘pattern’ morphology of Arabic (and by extension other Semitic languages) in terms of linguistic principles of widespread applicability, specifically according to the principles of autosegmental phonology proposed to govern tone and vowel harmony systems. Subsequent applications of the autosegmental morphology hypothesis, or of its successor, the Prosodic Morphology Hypothesis (McCarthy and Prince 1990a, 1990b, 1995), to languages other than Arabic, both Semitic (Syriac, Hebrew, Amharic) and non-Semitic (Tashlhyt Berber, Ibibio, Yawelmani Yokuts, Sierra Miwok, Choctaw), have had two interesting results. First, it has become apparent that non-concatenative morphological patterns formally similar to those of Arabic are found outside of Arabic and Semitic languages. Second, however, adaptation of the theory to account for the data of these languages has often required modification or rejection of one or more of the key principles of the theory. Frequently suggested modifications have taken the form, ‘Unlike in Arabic, in language X, such and such a principle is not (or is) motivated’. Thus, Arabic morphology is again starting to appear idiosyncratic.
Classical Arabic Re-considered ! 213
In what follows I will argue that this apparent idiosyncrasy is indeed only apparent. I will show that several of the most significant modifications motivated on the basis of data from languages other than Arabic are equally motivated on the basis of the Arabic data upon which the theory was originally developed. While this will force us to reject some of the originally hypothesized universals, it will allow us to state a new set of universal properties of shape-invariant morphology on the basis of a now much larger body of empirical evidence. The key points I will argue for are the following:

(1) The minimum unit of organization in the Arabic lexicon is the word, rather than the ‘consonantal root’. More specifically, Arabic words are not uniquely reducible to combinations of roots and patterns, and for this reason as well as others regularities in the lexicon can be more adequately stated in terms of relationships among words rather than in terms of combinations of morphemes. The arguments made in favour of word-based rather than consonant-root based morphology in languages such as Modern Hebrew (Bat-el 1994), Yawelmani Yokuts (Archangeli 1988, 1991) or English (Anderson 1992) are equally valid for Classical Arabic. The consonantal root continues to play a role in our analysis but as an intermediate form extracted from a word in the course of a derivation. This extraction is made possible by universal properties of phonological structure rather than by segregation of vowels and consonants onto separate morpheme tiers on a language-particular basis.

(2) Not all of Arabic morphology can be described in terms of fixed syllabic/vocalic patterns or templates. Rather, templatic morphology is found only in derived words.
This is the same conclusion which Archangeli (1988) reaches with regard to the distribution of templatic morphology in Yawelmani and is essentially, I believe, the same point intended by Akinlabi and Urua (1993) in arguing that lexical roots are not footed in Ibibio. Further, not all derived words in Arabic conform with a shape-invariant output pattern. Many derivations which have traditionally been analyzed as involving a shape-invariant template (e.g., many of the ‘binyanim’ or derived verb stems) can be more economically explained in terms of affixation and/or apophony.
(3) While the shape-invariant syllabic/vocalic template appears often to be the only indicator of a morphological function in Arabic, it is also sometimes the case that the template is a meaningless stem variant associated with particular affixes. Thus the sharp distinction assumed to exist (Archangeli 1988 [1984], Noske 1985, Smith 1985) between the way templatic morphology functions in Semitic languages (as a morpheme in itself) and non-Semitic languages (as stem variants associated with affixes) is in fact not so sharp.

(4) Filling of empty positions in templates is not governed by a universal directionality parameter (either left to right as proposed by McCarthy [1979, 1981] or edge-in as proposed by Yip [1988]). Morphologically motivated spreading of segments in Classical Arabic is not analogous to tonal spreading in tone languages (if it is indeed the case that tones spread to toneless segments according to a default directionality parameter). Rather, the only productive morphologically motivated spreading process affecting segments in Arabic is analogous to phonologically motivated segmental spreading as found in compensatory lengthening and related phenomena, that is, spreading of a consonant leftward or a vowel rightward to fill an empty coda position. Empty onsets are filled by a default consonant. A substantial body of evidence from a number of typologically and genetically diverse languages allows us to conclude that these are the true universal defaults for filling empty positions in quantity-based and templatic morphology: coda-filling by local spreading, onset filling by a default consonant.

One of the major reasons why Arabic has appeared, and continues to appear, idiosyncratic is that the data of Classical Arabic is filtered through a rich and sophisticated tradition of grammatical analysis.
In the linguistic literature, one often sees statements to the effect that root and pattern morphology is found in Semitic languages, or that Semitic languages have root and pattern morphology. A more accurate formulation would be that the data of Semitic languages have for centuries been analyzed in terms of a grammatical theory, in which the notions of (consonantal) roots and (vocalic/syllabic) patterns play a central role (note 1). McCarthy’s autosegmental morphology, while formalizing these notions in more explicit and universal terms, essentially takes for granted the traditional analysis. This more explicit
formalism has, however, made it apparent that the root and pattern (or now root and template) grammar is in many respects inadequate in terms of modern criteria of economy and descriptive adequacy. Thus while the arguments we will develop under headings 1 and 2 are explicitly formulated to address positions taken by contemporary researchers in the tradition of autosegmental morphology, these positions are ultimately inherited from the tradition of medieval Arabic grammar.

In the course of our analysis we will also take up the Prosodic Morphology Hypothesis (PMH) of McCarthy and Prince (1990a, 1990b, 1995) and McCarthy (1993). The core of the PMH is that ‘templates are defined in terms of authentic units of prosody: the mora (μ), syllable (σ), foot (F), and prosodic word (W) and so on’ (McCarthy and Prince 1990: 209). Two of the key components of the PMH will be considered and rejected in the course of our analysis: (1) the hypothesis that morphologically defined templates necessarily correlate with units of prosody higher than the syllable, i.e., prosodic feet, and (2) the hypothesis that anomalies in the derivational patterns of (what we might loosely call) ‘very short’ and ‘very long’ words are a function of independent constraints on minimality (bimoraic) and maximality (bisyllabic) imposed on underived words. The first hypothesis is, as McCarthy (1993: 188) notes, easily falsifiable and in fact, as we will show, false. There is no necessary correlation between morphologically defined templates and prosodic feet in Classical Arabic. For the second hypothesis, we will show that such constraints are both redundant and descriptively inadequate. The anomalies associated with ‘short’ and ‘long’ words in Classical Arabic are a direct consequence of templatic morphology itself, and follow simply and naturally from the premise that template satisfaction is called into play only in derived environments.
At the same time three components of the PMH remain well-supported by data from Arabic and other languages. (a) The definition and representation of morphological templates in terms of syllables and morae is clearly an improvement over the older CV notation, which effectively obscured the syllabic nature of the template by describing syllable structure in terms of feature values [segmental] = C, [syllabic] = V. This proposal follows from the work of Lowenstamm and Kaye (1986). (b) The principle of prosodic circumscription is motivated on the basis of infixation phenomena from a variety of languages. (c) Affixation of empty timing material, moraic
affixation, is clearly motivated on the basis of reduplication phenomena from a number of languages. The last two principles together open the way for a solution to a number of problems in the older autosegmental morphology hypothesis. However, McCarthy and Prince introduce these principles into the analysis in an essentially ad hoc fashion to accommodate data which does not conform with various predictions of the PMH. If we explore the full implications of allowing these principles into the analysis, it becomes possible to view Arabic morphology in a radically new way: as a system in which infixation and apophony processes are prominent and productive, while shape-invariant or templatic morphology plays a marginal role.
The Consonantal Root

In the tradition of Arabic lexicography, words are listed under the heading of a consonantal root, usually of three consonants. This organizational principle to some extent simply reflects facts of orthography (nucleus vowels are not written in Arabic consonantal script). But assuming that the notion does in fact reflect some linguistically significant generalization about Arabic, what sort of generalization is it? How should the concept be interpreted within a universal linguistic theory? The answer given by McCarthy (1979) is that the consonantal root is a morpheme in the structuralist sense (a minimal sign) which happens to be linearly discontinuous. Under this analysis, words in Arabic are formed exclusively by combinations of discontinuous morphemes: consonantal roots, syllabic templates, and vowel melodies. McCarthy argues that the facts of Arabic can thus be interpreted consistently with Halle’s (1973) theory of the mental lexicon as a list of morphemes as opposed to Aronoff’s (1976) theory of the lexicon as a list of words. Recent combinatorial or word-syntax (and hence morpheme-based) approaches such as those of Lieber (1995) or Stonham (1994) have adopted McCarthy’s analysis of Arabic and consequently the assumption that consonants and vowels may be separated onto different morpheme tiers on a language-particular basis.

The major objection to this analysis is that it achieves a unified account of diverse phenomena in one area of language (morphology) at the expense of imposing an underlying and otherwise unmotivated diversity in another area (phonology). In order to achieve
a unified theory of morphology as universally combinatorial, it is necessary to abandon a unified analysis of a simple and seemingly universal pattern in phonology: namely that well-formed words in virtually all languages involve systematic linear alternation between segments of relatively low and relatively high sonority (consonants and vowels, respectively). A morpheme-plane based analysis explains differences in the types of morphological operations found in various languages by positing that this single surface phonological structure in fact reflects three or more different underlying structures (note 2). Noske (1985: 344), for example, distinguishes three types of languages: languages like Arabic in which consonants and vowels are not underlyingly linked to a skeleton, languages like English and Dutch in which planar segregation is assumed to exist for the sake of consistency but is redundant since vowels and consonants are obligatorily linked underlyingly, and languages like Yawelmani in which some segments are obligatorily linked. McCarthy (1989: 88) likewise assumes a complex typology: languages like English in which ‘planar C/V segregation, together with template morphology, would be impossible’, languages like Arabic and other Semitic languages where a surface word like kataba ‘he wrote’ is derived from separate lexical entries comprising a vowel -a- and k-t-b on separate planes, and languages like Yawelmani in which a surface word is derived from a complex lexical entry consisting of vowels and consonants underlyingly represented on separate planes. Thus syntax/morphology is made consistent at the cost of making phonology completely idiosyncratic. This is an uneconomical trade-off since the superficial diversity in morphology has to be necessarily acknowledged on some level, while the proposed underlying diversity in phonology is not necessary to explain the phonological facts but is only required to impose conformity with the theory.
That is, it represents a case of postulating entities beyond necessity. Of course the reason for proposing this typological diversity is the fact that the ‘consonantal root’ as morpheme hypothesis has largely proved unworkable for other languages which exhibit nonconcatenative morphology similar to that found in Arabic. For example, research into the templatic morphology of the California Penutian languages Yawelmani (Archangeli 1988 [1984], 1991, Noske 1985) and Sierra Miwok (Goldsmith 1990: 83 ff., Smith 1985) has shown that in these languages morphological rules have to distinguish consonants and vowels although there is no motivation for
assuming that the vowels and consonants of a word are separate morphemes. In Archangeli’s autosegmental treatment (1984/1988, 1991) of Yawelmani, this observation is encoded by listing consonants and vowels on separate tiers, which are not, however, morpheme tiers (note 3). Research into the templatic morphology of languages genetically related to and typologically similar to Arabic has also tended to dispense with the notion of consonantal roots as morphemes. Dell and Elmedlaoui (1992: 99), for example, note, ‘In Imdlawn Tashlhyt Berber there are no reasons to assign vowels and consonants to different morphemes as there are in Classical Arabic’. For Modern Hebrew Bat-el (1994: 575) argues that, ‘... the consonantal root does not exist as an independent unit in Modern Hebrew’. Closer to home Heath’s (1987: 12) analysis of Moroccan Arabic is not root based: ‘The model I will use for M[oroccan] C[olloquial] A[rabic] dispenses with abstract root representations and with the various tiers of McCarthy’s C[lassical] A[rabic] model. Underived noun, verb, and other stems have a simple linear representation ... Ablaut derivations are basically produced by mapping/projection onto an output template....’ Thus, for languages very similar in morphological structure to Classical Arabic, two alternatives to mapping of roots from the lexicon to surface templates have been suggested: either consonantal roots are extracted from words as part of a process of derivation (Bat-el 1986, 1989) or more simply input words are mapped directly to output templates (Heath 1987). In Bat-el’s (1986, 1989) analysis, the consonantal root still has a role to play in the morphology, but as an intermediate form abstracted during a process of derivation rather than as an underlying form.

For CA as well there are a number of empirical problems with the view that words are exhaustively reducible to combinations of roots, vowel melodies, and patterns.
First, in underived nouns CVCCun and verbs yaCCVC the quality of the vowel is not predictable and may be any of the three short vowels in the language:
(1) ya-ḍrib-u   ‘hit’      qird-un   ‘monkey’
    ya-ktub-u   ‘write’    rumḥ-un   ‘spear’
    ya-šrab-u   ‘drink’    kalb-un   ‘dog’
Listing these words as words with the quality of the vowel simply specified is the most economical way to encode this fact.4 In a consonant root-based analysis, the stem vowel and the string of consonants have to be treated as separate morphemes and each word has
Classical Arabic Re-considered ! 219
to be derived from its ‘root’ by a nonce rule. Further, since the stem vowel must thus be interpreted as a separate morpheme, the analysis makes it necessary to define a morpheme as a set of phonological elements (a set of short vowels) whose form ranges over all the possible realizations of that class of phonological elements in the language and whose meaning or function is empty (or equivalent to ‘underived form’). Second, a root-based analysis obscures regularities which are only statable in terms of relationships between fully formed words. For example, Hammond (1988) draws attention to plural forms like those in (2) which differ consistently from their respective singulars by the presence of a long /aa/ vowel in the second syllable. In a root-based analysis, each of the plural patterns (CiCaaC, CaCaaCiC, CaCaaCiiC, CawaaCiC, CaCaaʔiC, maCaaCiC) has to be listed as an independent allomorph which combines directly with a root. The fact that various features of the singular word (e.g., the /m/ of the prefix mi-, the quality of the vowel of the last syllable) are consistently carried over into the plural cannot be accommodated in the theory.
(2) sg.          pl.
    kalbun       kilaabun      ‘dog’
    daftarun     dafaatiru     ‘notebook’
    ḍamiirun     ḍamaaʔiru     ‘pronoun’
    xaatamun     xawaatimu     ‘seal’
    xaataamun    xawaatiimu    “”
    ṣunduuqun    ṣanaadiiqu    ‘box’
    miftaaḥun    mafaatiiḥu    ‘key’
In order to account for cases like those in (2), McCarthy and Prince (1990a, 1990b) introduced a principle of prosodic circumscription, reiterated in McCarthy (1993: 189): Typically a morphological operation like affixation is applied to a morphological category like root, stem, or word to give a prefix or suffix of the usual sort. Under prosodic circumscription, though, a morphological operation is applied to a prosodically delimited substring within the morphological category, often yielding some sort of infix. Cases like the plural formations in (2) are then handled as mapping of a prosodically circumscribed bimoraic base, i.e., , ,
, <xaa>, <mif>, etc., to a partially-specified plural template , with material outside the base carried over in the derived forms. The prosodic circumscription analysis allows for an essentially infixational account of forms like those in (2). Yet, the analysis is obviously incompatible with the hypothesis that Arabic morphology can be exhaustively reduced to combinations of roots and patterns. One rule type (mapping to a template) may reference two radically different classes of structural descriptions: either a morphologically defined discontinuous string of segments, or a phonologically defined continuous string which is part of a word. The first type of operation can be analyzed as morpheme combination, but the second can only be analyzed as a combination of a morpheme with a phonological entity which does not have any reference and hence is not a morpheme. If we allow that some derivational rules must reference (phonologically-defined parts of) words rather than underlying consonantal roots, then it would seem that we must assume a lexicon in which both words and roots are listed. It is possible that some rules (e.g., level I rules) combine roots with patterns to form words or stems, and others (level II) operate on words or stems. This is the position taken by Keegan (1987) and by Ratcliffe (1990, 1992). But since some regularities must be stated in terms of operations on phonologically defined parts of words, it becomes worthwhile to ask whether erstwhile root-based morphology can not also be stated in terms of processes operating on strictly phonologically defined constituents.
An Alternative to Tiers: Horizontal Segmentation
The crucial problem with the morpheme-based morphology programme is that it begs the question of morphological acquisition. As Anderson (1992: 291) argues:
Word internal morphological structure ... is only inferred from the relation of words to one another; and what we have seen is that once we have described the relations themselves (which must be done in any event), the further step of attributing additional internal form to the items that are related is probably unnecessary.
This argument is particularly cogent in the case of Arabic, because the erstwhile morphemes (discontinuous strings of consonants or vowels and syllable patterns) can never appear (on the surface)
outside of the word in which they occur, and further such entities as ‘the consonants in a word’, ‘the vowels in a word’, and ‘the syllable structure of a word’ are well-defined (and universal) units of phonological structure. Once we begin to analyze these systems in syllabic and prosodic terms, extraction processes such as those proposed by Bat-el (1986, 1989) appear as phonologically natural as prosodic circumscription. The prosodic hierarchy of McCarthy and Prince admits only one dimension of prosodic structure, namely a linear or time dimension. But prosodic structure is at least two dimensional, incorporating peaks and troughs of sonority (cf. Angoujard 1990). A syllable is more than a sequence of morae. It also necessarily incorporates a peak of sonority. My proposal is that just as infixation makes reference to a phonological base defined by vertical segmentation of the word into linear constituents, ‘root-and-pattern’ morphology (in Arabic and other languages) makes reference to a phonological base defined by horizontal segmentation of the word into peaks and troughs of sonority (consonants and vowels). Thus for the second example in (2), the following segmentation strategies are available:
(3) root extraction by horizontal segmentation
      peaks:     a     a
      troughs:   d  f  t  r
    base extraction by vertical segmentation
      μ μ | μ μ
      daf | tar
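The two segmentation strategies in (3) can be sketched in code. The Python fragment below is only an illustration and is not part of the original analysis: the function names, the three-vowel inventory with long vowels written double, and the simplified mora count (each vowel slot and each coda consonant counts as one mora) are all assumptions made for the example.

```python
import re

VOWELS = "aiu"  # assumed inventory; long vowels are written as doubled letters

def horizontal_segmentation(stem):
    """Split a stem into sonority troughs (consonants) and peaks (vowels)."""
    troughs = [ch for ch in stem if ch not in VOWELS]
    peaks = re.findall(f"[{VOWELS}]+", stem)  # a long vowel is a single peak
    return troughs, peaks

def vertical_segmentation(stem):
    """Split off the initial bimoraic base and return (base, residue)."""
    moras, k = 0, 0
    while k < len(stem) and moras < 2:
        ch = stem[k]
        nxt = stem[k + 1] if k + 1 < len(stem) else None
        if ch in VOWELS:
            moras += 1  # every vowel slot contributes one mora
        elif moras > 0 and (nxt is None or nxt not in VOWELS):
            moras += 1  # a consonant not followed by a vowel is a moraic coda
        k += 1          # onsets are consumed without adding a mora
    return stem[:k], stem[k:]

print(horizontal_segmentation("daftar"))
print(vertical_segmentation("daftar"))
```

Both functions look only at phonological properties of the word (vowel versus consonant, mora count), which is the point of the argument: neither needs a lexically listed root.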
In addition to eliminating an idiosyncrasy from phonological theory, the analysis of consonantal roots as purely phonological entities extracted from words provides a way of solving a number of empirical problems, including issues of variation and change, which cannot be handled under the root-as-morpheme approach. For example, loan words are regularly accommodated into the system of pattern-based morphology. In modern written Arabic, the English words film and bank have been borrowed and supplied with a ‘broken’ or ‘stem-internal’ plural aflaam and bunuuk, respectively. The process can be analyzed in autosegmental terms as mapping of the consonants f-l-m and b-n-k to the plural patterns aCCaaC and CuCuuC, respectively (which are the productive plural patterns for CiCC and CaCC, respectively). How is this possible? It cannot be because b-n-k is a morpheme in the loaning language, nor can it be due to any evidence from alternations with other words containing these ‘roots’.
It can, and I think certainly must, be due to the fact that speakers recognize the consonants in the word as a phonological unit, which can be extracted and mapped to a template. The assumption that Arabic allows at least two different types of word-based segmentation, vertical and horizontal, also accounts for certain patterns of allomorphy in Arabic. Under this assumption, for words of some phonological shapes, ambiguities in the interpretation of the base for derivation are expected to arise. The problem is that under a vertical (linear) segmentation Cvv syllables pattern with CvC syllables as bimoraic as opposed to monomoraic Cv syllables, while under a horizontal segmentation Cv and Cvv syllables, which have a single trough and a single peak of sonority, pattern together against CvC syllables, which have two sonority troughs. It is precisely those nouns which contain a Cvv syllable (CvvC-, CvvCvC- and CvCvvC- stems) that show the greatest range of allomorphic variation in plural formation (see 2.3.4, below). Thus, for the most populous class of underived nouns, CvCC- stems, the productive plurals are aCCaaCun, CuCuuCun, and CiCaaCun. The variation in vowel quality /aa/ vs. /uu/ is partly conditioned by the vowel of the singular. (Singulars with /a/ vowel favour plurals with /u/, those with high vowel favour plurals with /a/; Levy 1971.) The plural with initial a- has generally been analyzed, following a proposal of Levy (1971), as resulting from a rule of metathesis. So arguably the basic underlying plural form for these nouns is CvCaaC. Two-consonant CvC- stems show the same patterns, but with addition of a default third consonant in the plural: damun >> dimaaʔun ‘blood’, ḥamun >> aḥmaaʔun ‘father-in-law’. Singulars of the shape CaaC-, however, show an unusual pattern of allomorphy. They often resist analysis altogether, in which case they take the default ‘sound plural’ (ex. xaamun >> xaamaatun ‘material’).
When they take a broken plural it may be the productive aCCaaCun type as in baabun >> abwaabun ‘doors’. But quite frequently such nouns show a unique plural type CiiCaanun, baabun >> biibaanun, jaarun >> jiiraanun ‘neighbours’. This variation can be accounted for as follows: horizontal segmentation gives a two-consonant ‘root’ for these forms, and a third default consonant /n/ is added to fill out the third C of the template. But according to a vertical segmentation, the coda /a/ vowel patterns like the second consonant of CvCC-stem, and the second consonant of the input maps to third C position in the output template. No variation is expected in the plurals of CvCC- and CvC- nouns,
because either type of segmentation would yield the same plural, as shown in (4). We assume for the moment that mapping to a template is the plural formation strategy in both cases (although an affixation analysis is also possible here, see Where templatic and non-templatic morphology compete: the broken plurals). When a two-consonant ‘root’ is extracted from CvvC-, the long vowel behaves like a short vowel, length is maintained, and /a(a)/ >> /i(i)/ apophony applies.
(4) root extraction vs base extraction
    singular   root extraction (horizontal)    base extraction (vertical)     plural
    kalbun     k-l-b  >>  kilaabun             kal|bun  >>  kilaabun          >> kilaabun
    damun      d-m    >>  dimaaCun             dam|un   >>  dimaaCun          >> dimaaʔun
    baabun     b-b    >>  biibaaCun            baa|bun  >>  bvCaabun          >> biibaanun, /bVwaabun/
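The root-extraction half of (4), and the loan-word plurals discussed earlier, can be illustrated with a short sketch. This is a hedged illustration rather than the author's formalism: the function names are invented for the example, templates are written as strings in which a capital C marks an open consonant slot, and an unfilled slot is simply left as C (the text fills it with a default consonant such as the glottal stop or /n/).

```python
VOWELS = "aiu"  # assumed vowel inventory for the transliterated examples

def extract_root(word):
    """Horizontal segmentation: keep only the consonants of the word."""
    return [ch for ch in word if ch not in VOWELS]

def map_to_template(root, template, default="C"):
    """Fill each C slot of the template with the next root consonant.

    Slots left over once the root is exhausted keep the default symbol;
    root consonants beyond the last slot are ignored in this sketch.
    """
    consonants = iter(root)
    return "".join(
        next(consonants, default) if slot == "C" else slot
        for slot in template
    )

print(map_to_template(extract_root("kalb"), "CiCaaC"))
print(map_to_template(extract_root("dam"), "CiCaaC"))
print(map_to_template(extract_root("film"), "aCCaaC"))
print(map_to_template(extract_root("bank"), "CuCuuC"))
```

Nothing in the procedure requires f-l-m or b-n-k to be listed as a morpheme: the consonants are read directly off the borrowed word.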
We conclude then that there is no typological distinction among languages of the sort proposed by Noske or McCarthy. In terms of the way in which phonological structure of words is represented in
the lexicon, English and Arabic (or any other language, for that matter) are not different. There is a difference between English and Arabic, and among languages generally, in the extent to which the morphological component exploits a universal element of phonological structure, namely the distinction between peaks and troughs of sonority (vowels and consonants). Compare the case with reduplication. No one has proposed that the difference between languages which have reduplicative morphology and those which do not is due to fundamental differences in the underlying representation of words in the different languages. Rather, as McCarthy and Prince (1990a, 1995) have been at pains to point out, reduplication works by segmenting words into universally-definable phonological constituents, such as morae and syllables. Such constituents are also found in languages which lack reduplicative morphology, just as syllabic peaks and troughs are found in languages which lack apophony or fixed-pattern morphology. It is not necessary to assume that syllables are morphemes in languages in which, say, reduplication of the first syllable of a word is a productive morphological process. In the same way it is not necessary to assume that the string of consonants in a word is a morpheme in order to account for ‘root-and-pattern morphology’. Moreover, even from the point of view of morphology, English and Arabic are not as fundamentally different as the planar V/C segregation analysis would imply. The difference is one of degree not of kind, since English has some non-concatenative morphology, just as Arabic has quite a bit of concatenative morphology. Spencer (1988, 1991: 158–59) has noted there is a basic similarity between apophonic alternations like those in English sing/sang/sung and those in Arabic kataba/kutiba (‘he wrote’ / ‘it was written’).
Under a planar V/C segregation analysis we have to assume either (as, for example, Anderson 1992: 62 does) that these patterns are completely incomparable, because the underlying forms are radically different in the two languages, or (as Spencer does) that (some) English words do have complex lexical entries comparable to those which McCarthy proposes for Arabic words. This latter proposal introduces a major idiosyncrasy into English grammar in order to accommodate a marginal feature of English morphology. But the former position is also unappealing. The least we should expect from a linguistic theory with universalist ambitions is that it be able to account for similar phenomena in different languages in
a similar way. The horizontal segmentation analysis eliminates the difficulty. English and Arabic both have word-formation processes whose structural description includes reference to syllable peaks. Arabic simply has more of them.
Domain of the Template
The characteristic feature of templatic morphology, wherever it is found, is that a fixed syllabic (and sometimes also vocalic) pattern is consistently associated with a particular functional/semantic category, either as the unique marker of the category or as a stem allomorph obligatorily associated with an affix which marks the category. Put another way, this means, in effect, that for some derivations variant inputs will be brought into conformity with an invariant output pattern.
Examples of Templatic Morphology in Various Languages
Templatic morphology in Classical Arabic can be exemplified as in (5) by the elative (comparative and superlative) of adjectives and by denominal verbs of stem (binyan) II:
(5) Arabic comparative/superlative adjectives and denominal verbs
    positive          comparative m.    f.
    sahlun       >>   ashalu            suhlaa     ‘easy’
    kabiirun     >>   akbaru            kubraa     ‘big’
    ṣabuurun     >>   aṣbaru            ṣubraa     ‘patient’
    jaahilun     >>   ajhalu            juhlaa     ‘ignorant’

    noun                      verb past     non-past5
    šarqun      ‘east’        šarraqa:      yušarriqu     ‘to go to the east’
    jildun      ‘skin’        jallada:      yujallidu     ‘to skin (an animal)’
    xaimatun    ‘tent’        xayyama:      yuxayyimu     ‘to pitch a tent’
    ruxaamun    ‘marble’      raxxama:      yuraxximu     ‘to pave with marble’
    mariiḍun    ‘sick’        marraḍa:      yumarriḍu     ‘to nurse the sick’
In these cases both the syllable structure and quality of the vowels of the output words is consistent regardless of the syllable structure and quality of vowels of the input. Satisfaction of the output template is met by reading out vowels as well as affixes (such as the feminine -at suffix in xaimatun) and mapping the remaining consonants to the output template, as in (6). In (6), line 1 indicates the part of the input that is read out for derivation, line 2 the prespecified segmental portion of the template, and line 3 the prespecified syllabic portion of the template.
(6) Arabic comparative/superlative adjective formation
    [Diagram: for the inputs sahlun (vowels a, u), kabiirun (vowels a, ii, u), and jaahilun (vowels aa, i, u), line 1 shows the consonants s-h-l, k-b-r, and j-h-l read out of the input; line 2 shows the prespecified vowels of the template; line 3 shows the prespecified syllables (σ) of the template.]
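The output consistency of the patterns in (5) can be made concrete with a small sketch. It is an illustration under stated assumptions, not the notation of the chapter: templates are written as strings in which the digits 1, 2, 3 stand for the first, second, and third consonant read out of the input, the template names are invented for the example, and stems are given in plain ASCII without case endings or suffixes.

```python
VOWELS = "aiu"  # assumed vowel inventory for the transliterated stems

def consonants(stem):
    """Read the vowels out of the stem, leaving the consonants."""
    return [ch for ch in stem if ch not in VOWELS]

def fill(template, root):
    """Replace each digit n in the template with the n-th root consonant."""
    return "".join(
        root[int(ch) - 1] if ch.isdigit() else ch for ch in template
    )

# Hypothetical encodings of the output templates illustrated in (5).
ELATIVE_M = "a12a3u"      # e.g. ashalu, akbaru
ELATIVE_F = "1u23aa"      # e.g. suhlaa, kubraa
STEM_II_PAST = "1a22a3a"  # e.g. jallada (second consonant geminated)

for adjective in ("sahl", "kabiir", "jaahil"):
    root = consonants(adjective)
    print(adjective, fill(ELATIVE_M, root), fill(ELATIVE_F, root))
```

Inputs of three different shapes (CvCC, CvCvvC, CvvCvC) all come out in the same shape, which is what distinguishes a template from a rule that applies a constant change.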
Akinlabi and Urua (1993) identify a templatic pattern in the Benue-Congo language Ibibio. Monosyllabic verbs show one of three syllabic patterns in the base form but are regularly made to fit a heavy syllable template when the negative suffix is attached. (The suffix is treated as underlyingly -kv with the /k/ subject to spirantization between vowels and assimilation to a preceding consonant. The v assimilates to the stem vowel.)
(7) Ibibio negative verb forms (Akinlabi and Urua 1993)
    base     negative
    kop      koppo      ‘hear’
    yet      yette      ‘wash’
    ka       kaaɣa      ‘go’
    se       seeɣe      ‘look’
    faak     fakka      ‘wedge between two objects’
    K        k          ‘hang on a hook’
Here the shape invariant is defined in syllabic terms only. The quality of both the consonants and vowels of the input are maintained in the output.
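The Ibibio pattern can be sketched as a function that forces any monosyllabic base into a heavy syllable before adding the suffix. This is a simplified illustration, not Akinlabi and Urua's analysis: the function name, the five-vowel inventory, and the use of the symbol ɣ for the spirantized /k/ are assumptions made for the example.

```python
import re

SPIRANT = "ɣ"  # assumed symbol for /k/ spirantized between vowels
SYLLABLE = re.compile(r"([^aeiou]*)([aeiou]+)([^aeiou]*)")

def ibibio_negative(base):
    """Fit the base to a heavy syllable, then attach the -kv suffix."""
    onset, nucleus, coda = SYLLABLE.fullmatch(base).groups()
    vowel = nucleus[0]
    if coda:
        heavy = onset + vowel + coda   # CvC: a long vowel is shortened
        suffix_consonant = coda        # /k/ assimilates to the coda
    else:
        heavy = onset + vowel + vowel  # Cvv: a short vowel is lengthened
        suffix_consonant = SPIRANT     # /k/ spirantizes between vowels
    return heavy + suffix_consonant + vowel  # suffix vowel copies the stem vowel

for base in ("kop", "yet", "ka", "se", "faak"):
    print(base, ibibio_negative(base))
```

The shortening in faak and the lengthening in ka and se are opposite changes that produce the same output shape, which is why the regularity is better stated over the output than as a rule on the input.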
(8) [Diagram: the bases kop, se, and faak each mapped to a heavy syllable (σ) template.]
Horizontal segmentation can be invoked to account for the spread of the stem vowel to the suffix, as spreading from peak to peak.
(9) Spreading in Ibibio
    [Diagram: in the negative forms of kop, faak, and se, the stem vowel (o, a, ee) spreads from the peak of the stem syllable to the peak of the suffix syllable across the intervening consonants.]
The stem templates associated with affixes in Yawelmani Yokuts (Penutian languages of western North America) likewise show invariant syllable structure with both vowel and consonant quality carried over from the input. Note that in the Sierra Miwok examples, the quality of vowels is carried over but the relative order of vowels and consonants is not preserved. (Underlining indicates geminate segments.)
(10) Sierra Miwok derived verb stems (Goldsmith 1990: 85)
     base                 derived stems
                          cvcvcc      cvccvc      cvccv
     cvcvvc   huteel      hutell-     huttel-     hutle-     ‘roll’
     cvccv    celku       celukk-     celluk-     celku-     ‘quit’
     cvccv    milli       mili-       milli-      mil i-     ‘sing’
(11) Yawelmani (Archangeli 1984 [1988]: 198) passive-adjunctive formation
     base                 cvcvc-
     cvc      caw-        cawah-neelaw      ‘shout’
     cvvc     c’uum-      c’umoh-noolaw     ‘destroy’
     cvcvv    hoyoo-      hoyoh-neelaw      ‘name’
     cvcc     amc-        amac-neelaw       ‘be near’
     cvvcc    diiyl-      diyel-neelaw      ‘guard’
     cvcvvc   biniit-     binet-neelaw      ‘ask’
In the Ibibio, Miwok, and Yawelmani examples segment quality is preserved but the syllabic pattern of the input is replaced by a fixed syllabic pattern. This is also the case in the formation of the plural verb stem in Pero, a Chadic language (hence distantly related to Arabic) which has up to now not been analyzed as having templatic morphology. According to the data given in Frayzingier (1977) the plural form of the verb has one of two syllabic patterns, each of which contains a core heavy-light syllable CvCCv sequence. (A minor exception is provided by verbs like biiro >> biʔiro, which have a default glottal stop which resists gemination.)
(12) Pero plural verb formation (Frayzingier 1977)
     plural patterns: cvccv, cvccvcv
     singular    plural
     ci          ciyyo       ‘eat’
     lofo        loffo       ‘beat’
     bina        bibbina     ‘wash’
     ceko        cekkuto     ‘lose’
     koofo       koffo       ‘pass’
     biiro       biʔiro      ‘make fire’
     foojo       fojjujo     ‘push’
     fundo       funduto     ‘cook’
     liguno      ligguno     ‘answer’
Finally, for English, another language which has been analyzed as not having templatic morphology, the past tense of one class of verbs is formed on a heavy syllable Cɔt template, where C represents any permissible onset in English including consonant clusters and zero. In (13) complex nuclei, onsets and codas are indicated by double underlining.
(13) English ‘-ɔt’ verbs
     bai      >>   bɔt      buy >> bought
     kætʃ     >>   kɔt      catch >> caught
     fait     >>   fɔt      fight >> fought
     brɪŋ     >>   brɔt     bring >> brought
     θeŋk     >>   θɔt      think >> thought
     ou       >>   ɔt       owe >> ought (archaic)
In these cases the onset of the unique syllable of a monosyllabic verb (in one case, beseech/besought, the last syllable of a bisyllabic verb)
is read out of the input and mapped to the onset position of a fixed Cɔt output.
(14) [Diagram: for each input (buy, fight, catch, owe, think, bring) the onset consonants (b, f, k, zero, θ, br) are read out of the input syllable and mapped to the onset of an output syllable (σ) whose nucleus and coda are prespecified as ɔt.]
Here present tense forms of verbs of various segmental and syllabic patterns show a consistent past tense form. The shape invariant consists both of a predetermined syllabic shape and segments of predetermined quality in nucleus and coda position.
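The English pattern lends itself to a very small sketch: read the onset out of the input and discard everything else. The code below is illustrative only. The function names are invented, the broad transcriptions are assumptions (the vowel of the template is written ɔ here), and only monosyllabic inputs are handled.

```python
VOWELS = set("aeiouæɪɔ")  # assumed vowel symbols for the broad transcriptions

def onset(form):
    """Return the consonants that precede the first vowel (possibly none)."""
    i = 0
    while i < len(form) and form[i] not in VOWELS:
        i += 1
    return form[:i]

def past_on_template(present):
    """Map the onset of a monosyllabic verb to the fixed output template."""
    return onset(present) + "ɔt"  # nucleus and coda are prespecified

for verb in ("bai", "kætʃ", "fait", "brɪŋ", "θeŋk", "ou"):
    print(verb, past_on_template(verb))
```

The nuclei and codas of the inputs differ in every case and none of them survives in the output, so no single change applied to the input would describe the set.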
Templatic (Shape invariant) Morphology vs Prosodic (Quantity based) Morphology
Since Marantz (1982), it has become customary in research on the morphology/phonology interface to draw comparisons between templatic morphology and reduplicative morphology. According to McCarthy and Prince (1995: 319), ‘In reduplicative and root-and-pattern morphology, grammatical distinctions are expressed by imposing a fixed phonological shape on varying segmental material.’ This formulation, however, obscures an important distinction. What templatic morphology and reduplication have in common is that they both involve manipulation of phonological quantity independent of quality; that is, both can be analyzed as specifying empty phonological time, which is filled independently by segmental material. But they differ crucially in terms of what exactly is fixed or invariant. Reduplication (in its ideal form) is change-consistent morphology. It can be characterized in terms of a rule which changes an input in a consistent way by adding something to it. In templatic morphology, by contrast, the output is fixed or consistent, so that, given variant inputs, the relationship between input and output is inconsistent. We can thus define the formal/phonological properties of relationships between related words in a morphological system along a grid, the axes of which are (a) quantity vs quality and (b) change
consistency vs output consistency. Reduplication, on the one hand, and templatic morphology of the type illustrated above in the Ibibio, Sierra Miwok, and Pero examples, on the other, line up on the same (quantity) side of the quantity/quality axis but on opposite sides of the consistency axis. Processes affecting quality only, such as vocalic apophony, can also be separated along the consistency axis. An example of output-consistent quality-based morphology is provided by passive voice formation in Arabic. The non-past (‘imperfect’) passive stem has an /u/ vowel in the prefix and an /a/ stem vowel regardless of the vocalism of the active.
(15) Arabic voice oppositions
     active            passive
     yaḍribu     >>    yuḍrabu     ‘beat’
     yaktubu     >>    yuktabu     ‘write’
     yašrabu     >>    yušrabu     ‘drink’
On the other hand, tense contrasts in the Arabic verb involve a regular change in vowel quality. Verbs with a high stem vowel in the non-past form (/i/ or /u/) generally show a low vowel (/a/) in the past, while verbs with /a/ in the non-past form generally show a high vowel (/i/) in the past.
(16) Arabic tense (aspect) oppositions
     non-past (‘imperfect’)      past (‘perfect’)
     yaḍribu               >>    ḍaraba     ‘beat’
     yaktubu               >>    kataba     ‘write’
     yašrabu               >>    šariba     ‘drink’
Past tense formation in Arabic thus represents a case of change-consistent but output-variable quality-based morphology. The apophony marking tense in English verbs like sing-sang vs rang-rung, where it is a contrast between the past and present tense vowels rather than a particular vowel quality which indicates tense, is a somewhat murkier example of output variable quality-based morphology. Affixation (in the ideal or extreme case) is a mixed type of change-consistent morphology where both the shape of the added material and the quality of the segments filling the shape are fully specified. Oddly enough, it is harder than most people assume to find examples, at least in the core inflectional morphology of languages, of
the presumably ‘normal’ type of affixation, that is, affixation of a string fully specified both in syllabic/moraic structure and in all segmental features. Quite often the core affixation processes in a given language fall between this ideal and the ideal of reduplication. That is, the affixes are underspecified for one or more segmental features. For example, the most productive inflectional affixes in English (past tense t/d marker and the s/z marker of noun plural and 3rd singular verb) are unspecified for voicing, receiving this feature from the last segment of the word to which they are attached. Similarly, the vowels of suffixes in Turkish are partially unspecified, receiving backness and roundness specifications from the last vowel of the stem to which they are attached. Many of the Arabic templatic processes (such as those illustrated in (4)) and the English -ɔt verbs represent a mixed type of output consistent morphology where the shape or quantity of the output is fully specified and the quality of some of the segments which fill this output template are also fully specified. The extreme mixed form of output-consistent morphology, where both quality and quantity of the output are fully specified, presumably does not occur in languages for practical reasons. (This would mean something like ‘In language X the plural for all nouns is the word “gup”’.)
(17) Typology of morphological relationships
                       change consistent/             output consistent/
                       output variable                change variable
     quantity based    reduplication                  prosodic templates like those of
                                                      Ibibio, or Penutian languages
     quality based     Arabic tense apophony          Arabic voice apophony
                       Eng. tense apophony
     mixed             ‘normal’ affixation            Arabic templates
                                                      Eng. -ɔt verbs
Output invariant morphology thus poses a problem which, I believe, has not yet been clearly recognized by general morphological theory. Both the word-syntax ‘morphemes as things’ approach and the ‘morphology as rules or processes’ approach work with the assumption that there is consistency between base forms and derivatives. Output invariant morphology poses a particular problem for word-syntax approaches, because there is no obvious syntactic analogue
to the morphologically imposed shape invariant: one does not find sentences forced to conform with a fixed number of elements (words, syllables) through addition of default elements or deletion of elements. The closest analogue to the morphologically defined shape invariant in another domain of the grammar is the constraint in phonology. Essentially templatic or shape invariant morphology poses the same sort of problem for a rule-based morphological theory as ‘rule conspiracies’ did for early generative phonology (Goldsmith 1990: 321ff., Kisseberth 1970 and Sommerstein 1977). For forms like those in (4), (6), (9), (10), (11), and (12) it is, of course, possible to develop an analysis in terms of rules operating on the input to derive the output. But such approaches necessarily fail to reveal the core regularity in these systems. Goldsmith (1990: 85–87) demonstrates this point by contrasting rule-based and templatic approaches to Sierra Miwok forms like those in (9). Archangeli (1988: 178–79) also demonstrates the comparative disadvantages of a rule conspiracy approach to the data of Yawelmani. Ratcliffe (1997) develops a similar argument for English -ɔt verbs, contrasting a templatic treatment like that above with Halle and Mohanan’s 1985 treatment of these verbs in terms of morphologically conditioned phonological rules in the framework of lexical phonology.
The Domain of Templatic Morphology in Classical Arabic
The phonological analogy is, however, not entirely apt. The current research programme in Optimality Theory treats phonological constraints as an alternative way to deal with all phenomena heretofore handled by rules. In morphology, as I see it at least, there are two fundamentally different types of phenomena, one of which (change consistent morphology) can be economically analyzed in terms of rules, and the other (output consistent morphology) is better analyzed in terms of constraints. All Western traditions have taken the former type of morphology as the norm and effectively ignored the latter. Medieval Arabic grammar took the latter as the norm and is in effect a constraints-only approach to morphology. In this tradition, all words are analyzed as conforming to a pattern (miθaal, binaaʔ) of some type and emphasis is placed on cataloguing patterns and determining, if possible, the semantics or function of each pattern. There may be some discussion of what patterns are related
derivationally (e.g., for a given broken plural pattern, what are the most likely singular patterns), but essentially no attempt to analyze formally the relationship between derivationally related patterns or words, and of course no attempt to construct formal rules as a way of characterizing such relationships.
Where the Templatic Analysis is not Necessarily Motivated: The Core Derivational Morphology
This tradition or habit of analysis continues to influence modern theoretical treatments of Arabic. For languages other than Arabic (Yawelmani, Miwok, Ibibio) templatic morphology is presented as in the examples above with input and output forms listed side by side, revealing the fact that regularities in shape invariant or templatic morphology are statable, as it were, down the output column, rather than across the input-output row. Analyses of Arabic generally obscure this fundamental point by simply listing sets of ‘templates’. (Contrast for example Goldsmith’s [1990: 85] presentation of Sierra Miwok with the same author’s [1990: 97] presentation of Arabic.) This habit of analysis leads to some confusion about exactly how and to what extent Arabic exhibits templatic or shape invariant morphology. Thus, McCarthy and Prince (1990, 1995) offer the broken plurals in (2), partially repeated in (18), as a representative example of templatic morphology:
(18) sg.           pl.
     kalbun        kilaabun       ‘dog’
     daftarun      dafaatiru      ‘notebook’
     xaatamun      xawaatimu      ‘seal’
     ḍamiirun      ḍamaaʔiru      ‘pronoun’
     miftaaḥun     mafaatiiḥu     ‘key’
Yet, these are far from being representative examples of templatic morphology. The basic criterion does not hold here, since the outputs do appear to represent a regular change to the input. Perhaps the theory of templatic morphology has to be enriched to include cases where an output includes a shape invariant portion plus something else, namely a residue from the input. If we allow this possibility, then these patterns can be described as McCarthy and Prince propose in terms of mapping of an initial bimoraic sequence to an ‘iambic’ light-heavy syllable output, with the residue from the singular
carried over. However, an analysis in terms of infixation (of long /aa/, for example) is also possible here. In fact McCarthy (1993: 200–204) assumes that infixation processes are involved in two areas of Arabic morphology, which were assumed to be templatic in his earlier analyses. These are the active participles of the base verb (ex. kaatibun ‘writing’), and the past stem of derived verbs II (kattaba ‘he caused to write’) and III (kaataba ‘he corresponded’). Since these forms do not contain an initial prosodic foot, the PMH forces McCarthy to assume that derivation must proceed by some process other than template mapping. He proposes that that process is affixation of an empty mora to a prosodically circumscribed base. He thus arrives at the conclusion that verb and noun morphology in Arabic are completely asymmetrical: ‘The profound formal differences between nouns and verbs in Arabic – the former principally templatic, the latter principally nontemplatic – accord with the morphological organization of the language.’ (McCarthy 1993: 211, cf. McCarthy and Prince 1990b: 35). I would argue that the appearance of profound formal difference in noun and verb in this account does not at all accord with the morphological organization of the language. Rather, it accords with, and follows from, an uncritical acceptance of traditional habits of analysis (which take the past or perfect stem of the verb as basic) and from an attempt to force the data into conformity with the hypothesis that the shape invariant template must correspond with a prosodic foot. If we take the non-past form of the verb as basic, which is motivated by generative and general descriptivist considerations of simplicity and descriptive adequacy,6 and further assume that morphological rules operate on words rather than more abstract entities, there emerges not a profound formal difference but rather a profound formal similarity between nominal and verbal morphology in Arabic.
For the most populous type of verbs and nouns, those which have a three-consonant stem (CvCCun, yaCCvC), the most productive ‘internal’ derivations in both nominal and verbal morphology involve the same process (whether that process is described in terms of infixation or template mapping):

(19)  σμμ σμμ          σμ σμμ σμμ
      CvCCvC     >>    CvCvxCvC
      yaktub     >>    yukattib     (I >> II)      ‘cause to write’
      yaktub     >>    yukaatib     (I >> III)     ‘correspond’
      kalbun     >>    kilaabun     (plural)       ‘dogs’
      kalbun     >>    kulaibun     (diminutive)   ‘little dog’
The alternations in (19) could be analyzed either as involving suffixation of a bimoraic sequence to a prosodically circumscribed heavy syllable Cvx base or as mapping the same base to a partially specified output template. In other words, affixation and mapping yield the same result.7

(20)  mapping:             σμμ (CvX)  >>  σμ σμμ (CvCvv)
      or
      moraic affixation:   σμμ + μμ   >>  σμ σμμ
                           μμ + μμ    >>  μμμ
                           Cvx + vv   >>  CvCvv

In fact, once we admit the principles of prosodic circumscription and moraic affixation into the grammar of Arabic, we virtually eliminate the need for shape-invariant templates. All of the productive ‘internal’ morphology of Arabic can be accounted for in terms of infixation of underspecified timing material. Besides the patterns in (19), two other productive categories are the active participle CaaCiCun and the stative verbs of colour or defect yuCCaCCa, yuCCaaCCa. The participle is of course a noun (adjective) derived from a verb. If we take it as derived from the non-past stem of the verb (which is motivated on semantic/functional grounds, both non-past stem and participle being associated secondarily with continuous aspect), it can be analyzed as formed by the productive rule affixing the same -vμ- or -aμ- sequence to a heavy syllable base, followed by loss of the verbal prefix:8

(21)          aμ affixation       prf. del.        coda filling
      tubu         >>        tvbv    >>    kaμtibun     >>     kaatibun
The stative (stem 9) verb forms (yuḥmarra, pf. iḥmarra ‘to be red’) can be taken as deriving from the phonologically minimally marked plural adjective forms, like ḥumrun ‘red’ (pl.), xuḍrun ‘green’ (pl.), by the same rule of affixation to a circumscribed Cvx base, followed by prefixing of the verbal prefix and loss of the nominal suffix:

(22)          aμ affixation       prefixation       coda filling
      run          >>        run     >>    yuḥmaμra     >>     yuḥmarra
In nominal forms (plurals and participles) the added empty coda positions are filled by rightward spread, yielding a long vowel rather than a geminate consonant. In verbs the added empty coda position is usually (forms II and IX) filled by leftward spreading, yielding a geminate consonant. This directional asymmetry is consistent with other differences between nominal and verbal morphology: nouns are predominantly suffixing, verbs predominantly prefixing.

Where Templatic Analysis is not Motivated

Derived verbs and default syllabification: A second area of morphology usually assumed to be templatic is the set of derived verbal stems traditionally labelled VII, VIII and X. These were taken as representative examples of templatic morphology in McCarthy (1981), and have frequently been cited as such in general presentations of morphology and phonology (e.g., Goldsmith [1990: 98], Spencer [1991: 136ff.]). In a later work McCarthy rejects the templatic status of these for theory-internal reasons, since once again the expected correlation between a morphologically defined structure and a prosodic foot fails to be met.9 We concur in McCarthy’s judgement, but for a different reason. The examples in (23) illustrate the verbal patterns in question in the traditional way, with the base verb labelled ‘I’.

(23) Derived Verb Forms in Arabic
              non-past       past
      I       yafʕulu        faʕala
      VII     yanfaʕilu      infaʕala
      VIII    yaftaʕilu      iftaʕala
      X       yastafʕilu     istafʕala

It is apparent that a principal marker of each of the derived forms is an affix: /n/, /t/ or /st/. We would analyze these forms as involving prefixing of the affix to a morphologically defined stem in the case of forms VII and X and suffixing the affix to the familiar Cvx base in the case of stem VIII.10 By the prosodic template analysis each of these derived forms must be treated as a separate template and the consonantal affix indicated as a morpheme on a separate tier, as in (24), based on McCarthy (1979).
(McCarthy takes the past stem as the base and eliminates the prothetic glottal-stop vowel sequence from his templates.)
(24)  Form    Affix tier    Template
      VII     n             CCVCVCV
      VIII    t             CCVCVCV
      X       st            CCVCCVCV
The recognition that templates are defined as fixed syllabic patterns makes it clear, however, that the verb patterns in (23) do not represent different templates. Each of these forms contains a stable sequence ..CaCa (perf.) ..CiCu (imp.), and the material to the left of this sequence is automatically syllabified according to universal principles of syllabification (Goldsmith 1990: 123).11 It is expected that in a given language, syllabification will proceed from one end of the word, arranging phonemic material into as many heavy syllables as possible. Thus, in the Arabic verb, when there is no derivational affix the structure of the non-past stem is CvC|CvCv (σμμ σμ σμ). When the derivational affix consists of a single consonant the structure is CVCCV|CaCa (σμμ σμ σμ σμ), and when the affix has two consonants the structure is CVCCVC|CaCa (σμμ σμμ σμ σμ). It is not necessary for the morphology to specify the syllabic structure of these forms. An analysis of these forms as fixed morphologically imposed templates would imply that, in addition to the form istafʕala, for example, distinctive patterns *isatafaʕala (σμ σμ σμ σμ σμ σμ), *isatfaʕala (σμ σμμ σμ σμ σμ), *isatafʕala (σμ σμ σμμ σμ σμ), *isatafaʕla (σμ σμ σμ σμμ σμ), *isatfaʕla (σμ σμμ σμμ σμ), etc., might also occur. They do not. Moreover, the appearance or non-appearance of a vowel between first and second stem consonants in the derived stems (yanfaʕilu vs yastafʕilu) is predictable under a default syllabification analysis but is a purely arbitrary and inexplicable phenomenon under a morphological template analysis. In short, the difficulty with an analysis which treats all of Arabic in terms of templates or shape invariants is that it overspecifies. It forces the morphological rules to specify information which can be specified more economically elsewhere.
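The syllable weights cited for the derived verbs can be checked mechanically. What follows is a minimal Python sketch, not part of the original analysis. It assumes a simplified transliteration (one character per consonant, long vowels written double) and only the two syllable types the chapter works with, light Cv and heavy Cvx. The names `syllabify` and `weights` are hypothetical, and word-final consonant clusters are deliberately not handled.

```python
VOWELS = set("aiu")  # assumption: the three short vowels; long vowels are doubled


def syllabify(word):
    """Split a transliterated word into Cv (light) and Cvx (heavy) syllables."""
    syllables = []
    i = 0
    while i < len(word):
        start = i
        if word[i] not in VOWELS:
            i += 1  # a single onset consonant
        if i >= len(word) or word[i] not in VOWELS:
            raise ValueError(f"no nucleus vowel at position {i} in {word!r}")
        i += 1  # the nucleus vowel
        if i < len(word):
            if word[i] in VOWELS:
                i += 1  # second half of a long vowel fills the coda
            elif i + 1 == len(word) or word[i + 1] not in VOWELS:
                i += 1  # a consonant not needed as the next onset fills the coda
        syllables.append(word[start:i])
    return syllables


def weights(word):
    """Return 'H' for heavy (two-segment rhyme) and 'L' for light syllables."""
    result = []
    for syl in syllabify(word):
        rhyme = syl if syl[0] in VOWELS else syl[1:]
        result.append("H" if len(rhyme) >= 2 else "L")
    return result


for form in ["yafʕulu", "yanfaʕilu", "yastafʕilu"]:
    print(form, ".".join(syllabify(form)), "".join(weights(form)))
```

Under these assumptions the three non-past forms come out as heavy-light-light, heavy-light-light-light and heavy-heavy-light-light, matching the structures given in the text without any form-specific template being consulted.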
Much of what is traditionally specified by the template is either carried over from the base form (as in the case of plurals and diminutives) or supplied later by default phonological rules (as in the case of syllabification of derived verbs VII, VIII and X).

The so-called ‘canonical nouns’: Another area which has been claimed to represent, but in our view does not represent, templatic morphology is the set of so-called ‘canonical noun stems’ defined by McCarthy and Prince (1990b, 1995) as listed in (25).
(25) The so-called ‘canonical noun patterns’
      (3C)   CvCC-    CvCvC-    CvCvvC-    CvvCvC-    CvvCvvC-
      (4C)   CvCCvC-  CvCCvvC-
Here too, the basic criterion of shape invariance does not hold. First, for the set of three-consonant stems it is apparent that every possible combination of three consonants permitted by the syllable structure constraints of the language is exhibited. Moreover, the meaning of this set of forms is simply ‘underived noun’. As in the case of the stem vowel, it makes no sense to define as a morphologically imposed shape invariant a set of forms which varies over all possible realizations of that set in the language and which is not associated with a consistent function. For four-consonant nouns, two of three possible bisyllabic combinations are regarded as canonical. The third possibility, CvCvCC- (e.g., sijill-un ‘register’, ‘scroll’), is very rare and not regarded as canonical by McCarthy and Prince. Yet under the assumptions of prosodic morphology, this ‘iambic’ pattern should be common. In our view, what the set of patterns in (25) represents is not a set of canonical noun stems, but rather a set of ‘populous’ or statistically prominent stem patterns. McCarthy and Prince’s count of these forms reveals not a morphological pattern peculiar to Arabic, but a type of analysis which is traditional to Semitic languages, although it could equally well be applied to other languages, no doubt with interesting results. That is, by counting the attested syllabic patterns of words in a dictionary, one could answer the question ‘how efficiently do the word forms of a given language exploit what is phonologically possible in the language?’ Although not all languages have templatic morphology of the type identified in Section 2.1, it seems safe to say that every human language has a set of populous (‘canonical’) word or stem forms defined in the same way that McCarthy and Prince define the canonical noun stems of Arabic.
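The counting procedure described above, tallying the attested syllabic patterns of words in a dictionary, is language-neutral and easy to sketch. The Python fragment below is an illustration only: the function names are hypothetical, the transliteration convention (long vowels doubled, vowels limited to a, i, u) is an assumption, and the word list is a toy sample of stems cited in this chapter, not a dictionary count.

```python
from collections import Counter

VOWELS = set("aiu")  # assumption: long vowels are written as doubled short vowels


def cv_skeleton(stem):
    """Reduce a transliterated stem to its C/v pattern, e.g. 'kaatib' -> 'CvvCvC'."""
    return "".join("v" if seg in VOWELS else "C" for seg in stem)


def pattern_counts(stems):
    """Count how many stems share each C/v pattern."""
    return Counter(cv_skeleton(stem) for stem in stems)


# Toy sample: stems mentioned in the chapter (case endings removed).
STEMS = ["kalb", "rukn", "malik", "daftar", "markab", "kaatib", "sijill"]

for pattern, count in pattern_counts(STEMS).most_common():
    print(pattern, count)
```

Run over a real lexicon, the same tally would show which patterns are ‘populous’ in the statistical sense the text intends, without presupposing that any of them is morphologically imposed.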
The explanation for the statistical predominance of a given set of forms is to be looked for not in abstract constraints on canonical form, but in general considerations of economy. Given that Arabic has twenty-eight distinctive consonants, three distinctive vowels, and only two syllable types, light Cv and heavy Cvx (with any segment, consonantal or vocalic, permitted in the coda), simple mathematics shows that the set of possible two-consonant stems would not provide an adequate vocabulary (28² × 3 × 2 = 4704). Permitting words to have three consonants increases the number of possible syllable patterns of the stem to five and increases the number of possible vowel combinations. The number of possible three-consonant words approaches 600,000, which is comfortably more than adequate. If we do not allow any larger number of syllables than have already been allowed for three-consonant stems, but permit four distinct consonants, the number of possible words increases to about seven million. Naturally, if we allow words with more syllables or more distinct consonants, the number of possible words quickly climbs far beyond what is necessary for an adequate vocabulary. Economy considerations predict that words of more than four syllables will be rare in a language like Arabic because they are not necessary, and that two-consonant words will be proportionately fewer than three-consonant words because the number of possible two-consonant words is proportionately much less than that of possible three-consonant words.

Finally, according to the principal morphological criterion for defining ‘canonicity’, the claim that only the set of nouns in (25) is canonical is empirically false. Nouns are defined as canonical on the basis of being ‘integrated into the morphological system’, specifically on their ability to form broken plurals (McCarthy and Prince 1995: 327). This implies that only nouns with the syllabic patterns in (25) should take broken plurals and they should do so obligatorily. Yet the list excludes many noun patterns which form broken plurals either obligatorily (e.g., two-consonant CvC- nouns) or potentially (trisyllabic 5-consonant CvCCvCvvC nouns like ʕankabuut). The list also includes noun patterns which usually do not take broken plural morphology, notably derived active participles, which have the shape CvvCvC-. The principal criterion determining whether a noun takes a broken plural or an external plural is whether the noun is basic or derived (Levy 1971 and Ratcliffe 1992).
Many underived nouns have the default (maximal) syllabification patterns CvCC and CvCCvC, and these virtually always form a broken plural. The other patterns (except the rare CvvCvvC) are much more common for derived nouns than for underived nouns. And the default case is that derived nouns take sound plural morphology. When nouns of the patterns CvvCvC and CvCvvC do show broken plural morphology they do so by virtue of the fact that they have become lexicalized. (Contrast, for example, kaatibun ‘writer’, ‘writing’, regular agent noun from the verb yaktubu, plural kaatibuuna, and the lexicalized kaatibun ‘writer’, ‘scribe’, plural kuttaabun.)
Moreover, unlike the distinction canonical/non-canonical, the distinction basic/derived has been noted as significant in determining the relative distribution of templatic and non-templatic morphology in other languages. Akinlabi and Urua (1993) propose, for example, that ‘... Ibibio verb roots in combination with a variety of suffixes target a disyllabic trochee with two templates, though the verb roots themselves are not independently footed.’ For Yawelmani, Archangeli (1988: 184) notes that ‘template-supplying affixes appear adjacent to regular verb stems (and underived nouns) only’, and she accounts for this by proposing that ‘only items with no underlying template are subject to template affixation’ (1988: 220). Most interesting from our point of view, Archangeli is motivated to propose the same exception to this principle that Ratcliffe (1992) proposes to account for sound/broken plural distribution in Arabic. She says that ‘lexicalized nouns which may at one time have been derived in the morphology are represented as underived nouns in underlying representation’, and hence may undergo template-supplying affixation (Archangeli 1988: 234). We conclude that in systems with templatic morphology, the forms of derived words may be subject to arbitrary constraints imposed by the morphology. Underived words are not subject to such constraints. Of course there is not an infinite variety of syllabic patterns for underived words in Classical Arabic any more than in any other language. As in all languages, general considerations of economy constrain the shapes of possible words.

Where templatic analysis is motivated: There are two principal areas where shape-invariant morphology is unequivocally found in Arabic. First, certain morphological categories, generally less productive ones, tend to impose a shape invariant. Examples of this type are the comparative/superlative adjectives and the denominal verbs cited in (4).
Note that in these cases derivation from the populous or default forms can be explained by affixation. The form II verb is normally derived from another verb. It is only in the case of verbs derived from a noun that mapping to a template is necessarily involved. For the comparative/superlative form, it is clear that for adjectives with an (unmarked) CvCCun pattern this form could be derived simply by affixation of a- in the masculine and -aa in the feminine, followed by default syllabification. Only adjectives with syllabically marked patterns require an analysis in terms of template mapping.
Defective and overlong nouns: The most significant domain in which template or shape-invariant morphology is operable in Arabic is in the morphology of nouns and verbs which do not conform with the more populous CvCCun, yaCvCCu patterns, i.e., biconsonantal stems with a long or short vowel (CvCun, CvvCun, yaCvC, yaCvvC), and noun stems with more than four consonants. Verbs which have a two-consonant stem in the non-past of the base form (form I), yaCvC, yaCvvC (e.g., yaṣilu ‘arrive’, yaquulu ‘say’), consistently add a third consonant in the past and in the derived stems:12

(25) verbs
                non-past    past              II non-past      participle
                            σμ σμ σμ          σμ σμμ σμ σ      σμμ σμ σ
                            CvCvC             cvCvCCvC         CvvCvC
      -CvC      yaṣilu      (waṣala)     >>   yuwaṣṣilu        waaṣilun     ‘arrive’
      -CvvC     yaquulu     (qaala)      >>   yuqawwilu        qaaʔilun     ‘say’
      -CvCC     yadullu     (dalla)      >>   yudallilu        daallun      ‘indicate’
      -CCvC     yaktubu     (kataba)     >>   yukattibu        kaatibun     ‘write’
The traditional analysis of these forms, which assumes underlying roots and takes the past stem as the base form, wrongly predicts that the non-past stems should have the form *yaCwuCu or *yawCCuCu (e.g., *yaqwulu from past qaala, ‘root’ q-w-l, *yawṣilu from past waṣala, ‘root’ w-ṣ-l). The actual forms yaquulu and yaṣilu then have to be derived by an ad hoc morphologically conditioned phonological rule. Taking the non-past as basic and assuming operations on words rather than roots, the forms of the past are predictable. Since the past stem is a derived category with a trisyllabic σ σ σ (CvCvCv) template, extra consonants must be supplied to fill out the template of the past stem of two-consonant verbs. Extra consonants must also be supplied in the form II derived verb. The participles of -CvC and -CvvC verbs are also fit to the CvvCvCun pattern by addition of a default consonant (i.e., waaṣilun ‘arriving’, qaaʔimun ‘standing’). Two-consonant nouns also are generally brought into conformity with a three-consonant template in derivation. Chart (26) compares plural formation of CvC, CvvC, CvCC (geminate) and CvCC nouns:

(26) nouns
      sg.                      pl.
                               σμμ σμμ σ
                               aCCaaC
      CvC-     ḥamun      >>   aḥmaaʔun                       ‘father-in-law’
      CvvC-    baabun     >>   abwaabun, (also) biibaanun     ‘door’
      CvCC-    liṣṣun     >>   alṣaaṣun                       ‘thief’
      CvCC-    ruknun     >>   arkaanun                       ‘pillar’

Nouns with more than four consonants are brought into conformity with a four-consonant template in plural formation.

(27) overlong nouns
                                         σμ σμμ σμ σμ
                                         CvCaaCvC
      CvCCvC-       markabun        >>   maraakibu      ‘boat’
      CvCCvCvC-     zanbarakun      >>   zanaabiku      ‘spring’
      CvCCvCvvC-    ʕankabuutun     >>   ʕanaakibu      ‘spider’
      CvCCvvCvC-    barnaamajun     >>   baraamiju      ‘program’
We conclude that template satisfaction in Arabic functions primarily to bring the derived output of stems of less populous syllable-structure types into conformity with the outputs derived by affixation from the more populous types.
Against Minimality/Maximality Constraints: It will have been noted that our analysis differs significantly from that of McCarthy and Prince (1990a,b) in its treatment of defective and overlong stems. In our analysis these words provide the prime examples of templatic morphology; in McCarthy and Prince’s analysis they are assumed to be outside of the system of templatic morphology altogether. McCarthy and Prince propose to account for the anomalies of these forms in terms of a set of independent constraints on the minimal and maximal size of an underived ‘canonical’ word: (a) the minimal word must be bimoraic; (b) the maximal word is bisyllabic. For the overlong nouns, in addition to these constraints, their analysis requires ‘a separate requirement that roots have at most four consonants’ and the operation of ‘non-grammatical analogy’ (McCarthy and Prince 1990b: 275). This approach is not only cumbersome and uneconomical but explanatorily inadequate. A constraints approach wrongly predicts that subminimal words and supermaximal words do not occur. McCarthy and Prince’s argument that such forms are statistically rare and morphologically exceptional does not answer this objection, nor do constraints account for the nature of the rarity and exceptionality of these forms. Two-consonant stems are not rare in Arabic except
in terms of type counts in a dictionary. In fact this class includes much basic vocabulary such as body parts and kinship terms (yad-un ‘hand’, ab-un ‘father’). Moreover, the principal apparent anomaly of these forms, the tendency to add a default third consonant in derivation, can be explained most naturally as a requirement imposed by the fixed output template. The imposition of ‘a separate requirement that roots have at most four consonants’ in the account of overlong nouns is even more troublesome. Under this analysis Arabic grammar would have two devices which do the same thing: one, the fixed output template, a device widely found in other languages; the other, the consonantal root, a language-particular device which imposes abstract (phonologically unrealizable) structure on abstract underlying forms. Further, as in the previous case, the analysis does not explain why the underlying root restriction is imposed only on the derived (plural) form of the word and not on the basic (singular) form. Most unhappily, McCarthy and Prince propose that the loss of consonants in these forms is ‘governed by non-grammatical, analogic factors’, by a ‘mechanism [which] is outside the formal grammar’. I see no difficulty with an explanation of this phenomenon in terms of analogy. But the distinction between grammar and analogy strikes me as illegitimate. Such a distinction would imply that adding -s to form the plurals of loan words in English is outside of the formal grammar of English. For that matter, since -s plural formation is in general the result of analogic extension, most of such plurals would be outside the formal grammar of English. As I see it, analogy, the application of grammatical rules to new domains, is at the heart of grammar. Moreover, templatic morphology is always essentially analogical: a way of regularizing the output of minority stem patterns to bring them into conformity with more populous types.
The processes of segregating a string of segments from a word, and adding or deleting segments to meet the requirements of a fixed output, are precisely the processes which have now been identified as characteristic of templatic or shape-invariant morphology language-universally. The responsibility of a theory of templatic morphology is to account for these processes. McCarthy and Prince’s recourse to a cryptic distinction between grammar and analogy is simply a way of avoiding the admission that their particular theory fails to account for the relevant data.
In short, the fixed output template is entirely sufficient to account for the apparent three-consonant minimum/four-consonant maximum requirement on derived words in Classical Arabic. If the fixed output template is acknowledged as part of the grammar (which seems to be necessary in any case), additional restrictions on the size of underived words are unnecessary, as are additional restrictions on the number of consonants in ‘roots’. The McCarthy and Prince analysis of these forms, with its redundant and ad hoc constraints, misses the whole point of templatic or shape-invariant morphology.
Where Templatic and Non-templatic Morphology Compete: The ‘broken’ Plurals: The strongest evidence for both (1) the hypothesis that Arabic morphology is word-based and (2) the hypothesis that Arabic quantity-based morphology includes both templatic and nontemplatic processes comes from patterns of allomorphy in the noun plural. As is well known, Arabic shows a remarkably high degree of allomorphy in noun plural formation. For each of the principal masculine stem types the following productive range of variation is observed:13

(28) Plural allomorphy of masculine noun stems
      CvCC      >>   aCCaaC, CuCuuC, CiCaaC
      CvC       >>   aCCaaG
      CvvC      >>   aCGaaC, CiiCaan
      CvCCvC    >>   CaCaaCiC
      CvvCvC    >>   CaGaaCiC, CaCaCat, CuCCaaC, CuCuuC
      CvCvvC    >>   CaCaaGiC, CiCaaC, aCiCCat, CuCaCaaʔ, aCCiCCaaʔ
For three- and four-consonant nouns with a default syllabic shape (CvCC, CvCCvC) the plural form represents a more or less consistent modification of the singular, involving a long vowel, usually /aa/, after the second stem consonant. (For CaCC nouns with /a/ vowel the plural with /uu/ is preferred (Levy 1971), suggesting that a vowel apophony rule is applicable in the domain of CvCC nouns.) Research in Autosegmental and Prosodic morphology has accounted for these consistent or ‘productive’ alternations in a variety of ways, depending upon the answers to the following questions: (a) What part of the word is affected in the derivation? (a string of consonants vs a prosodic domain such as a mora or syllable) and (b) What is the operation performed? (mapping to a template vs affixation).
One possibility is that plural derivation references a string of consonants and the operation involved is mapping to a template. This is essentially the approach of the earliest work on autosegmental morphology (McCarthy 1979). By this approach different templates are required for four-consonant and three-consonant words, unless the template with four C positions is taken as basic and unfilled positions are deleted.

(29) Plural derivation by mapping to a CvCvvC(vC) template
      kalb      >>   CvCvvC(vC)  =  kilaab
      daftar    >>   CvCvvCvC    =  dafaatir
A second possibility would again assume that plural formation references a string of consonants (without reference to intervening vowels), but in this case the operation involved would be affixation: insertion of /aa/ after the second consonant of the stem. (This approach is taken by McCarthy 1983.) A third possibility is that plural derivation references a prosodically defined part of a word, such as an initial bimoraic sequence (this is the proposal of McCarthy and Prince 1990a, 1990b, 1995). The derivation could either proceed by affixation of empty morae to this base or by mapping this sequence to a light-heavy syllable template. In the latter case the rest of the word is unaffected and carried over into the plural. Heath’s (1987) template plus projection model and Hammond’s (1988) templatic transfer model also assume some type of partial mapping. A fourth possibility is a variant of the third, but in this case the rule would specifically reference a heavy syllable.

The question we then wish to ask is, can any of these derivational possibilities be extended to account consistently for the ‘non-productive’ plural forms? The answer to this question appears to be no, yet the experiment yields an interesting and surprising result. The variety of possible ways to segment and derive words, which seems to be exceedingly redundant if we are only seeking a way to account for the single consistent alternation between the populous singular stems and their corresponding plurals, turns out to be directly in proportion to the apparently inconsistent and anomalous variety of plural allomorphs of the minority stems. Thus, if consonants only are mapped to the productive CvCvvC template, the long vowel of CvvCvC and CvCvvC stem nouns would not be reflected in the plural. In the case of CvCvvC stems, singular and plural would be identical unless a rule of vowel apophony applies as in the base noun. The expected and actual forms are given in (30).

(30) By mapping consonants only to the canonical 3C output template
      singular        expected plural      closest actual form
      CiCC       >>   CvCaaC               CiCaaC, aCCaaC
      CaCC       >>   CvCuuC, CvCiiC       CuCuuC
      CaaCiC     >>   CvCvvC               CaCaCat ([…]
      CvCiiC     >>   CvCaaC               […]
      CvCaaC     >>   CvCuuC, CvCiiC       […]

[…]
      singular        expected plural      closest actual form
      […]        >>   CvCaaC               CiCaaC, aCCaaC
      […]        >>   CvCaaCvC             CaCaaCiC
      […]        >>   CvvCaa(C)            CiiCaan
      […]        >>   CvvCaaC              CuCCaaC
      […]        >>   CvCaavvC             CaCaaʔiC
Contrariwise, rules referencing prosodic constituents rather than sequences of segments would treat Cvv and CvC syllables alike. The coda vowel, falling into an onset position as a result of affixation or else mapped to an onset position, would be realized as a glide, and the output would have an initial sequence conforming with the common plural templates. Affixation and mapping here yield the same result, assuming that vowel codas are mapped to C’s, and that material outside the CVX base is carried over in the derived form.

(32) By affixing /aa/ to an initial CVX (μμ) sequence or by mapping the same sequence to a CvCaa.. template, with V codas mapped to C’s
      singular        expected plural      closest actual form
      CvCCvC     >>   CvCaaCvC             CaCaaCiC
      CvvC       >>   CvGaaC               aCGaaC, CiGaCat ([…]
      CvvCvC     >>   CvGaaCvC             […]
      CvCvvC     >>   CvCaaGvC             […]

[…] >> σμ σμμ
      singular        expected plural      closest actual form
      […]C       >>   […]C                 CiCaaC, aCCaaC
      […]CvC     >>   […]CvC               CaCaaCiC
      […]C       >>   […]C                 aCGaaC (*CaGaaC), CiyaCat
      […]CvC     >>   […]CvC               CawaaCiC
      […]CvC     >>   […]CvC               CuCaCaaʔ
Thus, much of the allomorphy in the plural system can be explained as a result of competition in the grammar between, on the one hand, templatic (output invariant, change variable) and non-templatic (change invariant, output variable) morphology, and on the other between vertical (linear) and horizontal (C/V) segmentation processes.14 Chart (34) provides a summary.

(34) plural allomorphy as rule allomorphy
      rule   sg.>  CvC       CvvC       CvCC              CvvCvC            CvCvvC
      15           ??        ??         CvCaaC, CvCuuC    CvCaaC, CvCuuC    CvCaaC, CvCuuC
      16           CvCaa?    CvvCaa?    ”                 CvvCaaC           CvCaaGvC
      17           CvCaaG    CvGaaC     ”                 CvGaaCvC          ”
      18           ??        ”          ”                 ”                 CvCvGvvC
There are a number of small discrepancies between the actual patterns and those predicted. Some of these are explained by generally applicable rules. Thus, it seems to be a general rule in Arabic that a stem final long vowel can be shortened and compensated for by a
suffix -at. In the plural system a sequence CaCi/u- never appears, while aCCi/u- is common, indicating that a metathesis rule is generally applicable in this domain of the grammar. For plurals like kuttaab (not *kuutaab), wuzaraaʔ (not *wuzaʔaar) and biibaan (not *biibaaʔ) the syllabic and vocalic patterns are as predicted, but the onset and coda positions are filled in unexpected ways. For plural CuCuC (rather than *CuCuuC) the vowel quality is as expected, but the short vowel of the second syllable is anomalous. Yet every predicted form finds some plausible reflex, and virtually every irregular or idiosyncratic plural is matched by a predicted form. Since noun plural morphology allows this variation, while verbal morphology does not, we conclude (again pace McCarthy and Prince) that the noun morphology is essentially less ‘templatic’ than the verbal morphology.
Template May be a Property of the Affix and Not Necessarily a Morpheme in Itself

Another argument for seeing Arabic morphology as primarily affixational, with template satisfaction simply as an analogic device for regularizing minority stems, is that it is not only the outputs of ‘internal’ processes (affixation to a circumscribed base) that tend to conform to a template. Some types of ‘external’ processes (affixation to a stem) also require (or tend to require) that the output of the derivation conform to a fixed template. The relative adjective is formed by suffixation of -iiyun. Two-consonant stems always, and three-consonant stems with a long vowel usually, are made to conform to a three-consonant σμμ σ (CvCC-) or σμ σμ σ (CvCvC-) pattern before attachment of this suffix (Wright 1896: 149–64).

(35) Relational Adjective (Nisba)
                                                    σμμ σ                            σμ σμ σ
                                                    CVCC-iiyun                       CVCVC-iiyun
      CvCC          ‘sun’            šamsun         šams-iiyun
      CvCvC         ‘king’           malikun                                         malak-iiyun
      CvC           ‘how much’       kam            kamm-iiyun (‘quantitative’)
      CvC-t         ‘daughter’       bintun         bint-iiyun                       banaw-iiyun
      CvC           ‘fath. in law’   ḥamun                                           ḥamaw-iiyun
      CvC-at        ‘language’       luɣatun                                         luɣaw-iiyun
      CvC-at        ‘gum’            liθatun                                         liθaw-iiyun
      CvCvvC        ‘island’         jaziiratun                                      jazir-iiyun
      CvCvvC-at-    ‘enemy (f.)’     ʕaduuwatun                                      ʕadaw-iiyun
Archangeli (1988) says that the templatic morphology in Yawelmani Yokuts differs from templatic morphology in Semitic languages in the following way: ‘in Semitic, the root template is fixed by the morphology independently of any affixation. In Yokuts, the affixation determines the template of the root’ (ibid.: 175, cf. Noske 1985: 344). In Sierra Miwok, ‘the various derived templates appear in cases of derivational suffixation’ (Smith 1985: 364). However, as the Arabic relational adjective shows, cases where affixation requires or determines the templatic shape of the stem are known in Semitic languages. Arguably, Arabic comparative/superlative adjective formation, as analyzed earlier (in ‘Examples of templatic morphology in various languages’ and ‘Where templatic analysis is motivated’), also fits this pattern.
Conventions for Filling Empty Timing Slots (and for Deleting Excess Segmental Material)

In the preceding analysis we have noted a number of cases where segmentally unspecified or underspecified timing material is affixed or supplied by a template and filled later in the derivation by segmental material. The question of precisely how such empty timing material is filled by segmental material (and especially whether any universal principles govern this process) is an issue which has inspired a great deal of interest and controversy among students of templatic and quantity-based morphology.
Directional Spreading: Problems

McCarthy’s (1979, 1981) original autosegmental analysis incorporated a proposal that segments associate to templates according to the same conventions which Goldsmith (1976) proposed to govern association of tones to tone-bearing segments in tonal systems, that is, default left-to-right (L-to-R) association. The principal evidence for a direction of association is found in cases where the number of positions on the template and the number of segments which can fill those positions are unequal. Default L-to-R association predicts that where there are more positions in the template than there are segments the rightmost segment will spread, and that where there are more segments than template slots the rightmost segment will delete.
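The two predictions of default L-to-R association can be made concrete with a short Python sketch. This is an illustration under simplifying assumptions, not McCarthy’s formalism: only consonants and C slots are modelled, vowels are ignored, and the function name is hypothetical.

```python
def associate_left_to_right(segments, n_slots):
    """Link segments to template slots one-to-one, starting from the left.

    If slots outnumber segments, the rightmost segment spreads to the
    remaining slots; if segments outnumber slots, the rightmost segments
    stay unlinked (that is, they delete).
    """
    if not segments or n_slots < 1:
        raise ValueError("need at least one segment and one slot")
    linked = list(segments[:n_slots])   # surplus segments on the right are dropped
    while len(linked) < n_slots:
        linked.append(linked[-1])       # spread the rightmost segment
    return linked


# Three consonants, four C slots: the final consonant is predicted to double.
print(associate_left_to_right(list("ktb"), 4))    # ['k', 't', 'b', 'b']

# Five consonants, four C slots: the final consonant is predicted to delete.
print(associate_left_to_right(list("znbrk"), 4))  # ['z', 'n', 'b', 'r']
```

Neither output matches the attested forms: stem II kattaba doubles the medial consonant rather than the final one, and the plural zanaabiku in (27) keeps z, n, b, k rather than z, n, b, r.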
Significant exceptions to these predictions in the productive derivational morphology of Arabic have been recognized from the very beginning (Farwaneh 1990, Goldsmith 1990: 93–98, Heath 1987: 69–71, Hobermann 1988, 1992, 1995, Hudson 1985: 100, 1995, McCarthy and Prince 1990b and Yip 1988). Most importantly, the highly productive derived stem II (yukattibu/kattaba ‘he causes to write’ from yaktubu/kataba ‘write’) must be treated as exceptional, while the rare stem IX (yaḥmaaru/iḥmarra ‘it is red’) is regarded as regular. Yip’s (1988) alternative, default edge-in association, solves some problems. Yet this proposal is also based on the assumption that association conventions in templatic morphology are analogous to tonal association conventions. The assumption of default directionality appears to me problematic for theoretical as well as empirical reasons. First, in light of more recent work on tonal systems, it appears increasingly unlikely that a single directionality default governs tonal association (Goldsmith 1990: 29–30, Odden 1995). Second, the tonal analogy loses its appeal in an analysis that assumes that mapping rules operate on phonologically defined parts of words rather than ‘root autosegments’.15 Finally, default directionality and CV notation are closely linked. In light of the reanalysis of templates in syllabic terms, the attempt to account for the filling of all empty ‘C-positions’ or all empty ‘V-positions’ in a template in terms of a universal directionality parameter seems fundamentally misguided. It is only an accident of the CV notation that onset and coda consonants (or nucleus and coda vowels) appear to be entities of the same type. In prosodic theory, empty codas and empty onsets have profoundly different status, since syllable codas have moraic weight, while syllable onsets do not.
Coda vowels and nucleus vowels also have different status since, although both have moraic weight, nucleus vowels are required by universal syllabic well-formedness conditions, while codas are non-obligatory. Onset and nucleus filling have the effect of maintaining well-formed syllables and hence can be analyzed as triggered by syllabic well-formedness constraints. Coda filling, on the other hand, generally has the effect of maintaining a distinctive contrast between light and heavy syllables, and is thus a response to structural constraints imposed by the lexicon or the morphology. Since the different types of ‘filling’ have different effects and are triggered by constraints operative at different levels of the grammar,
there is in principle no reason to expect that a single default governs them all. Rather, we should be looking for three independent defaults: an onset filling default, a nucleus filling default, and a coda filling default. If any unity is to be found, moreover, it should be between the onset filling default and the nucleus filling default, since both are triggered by syllable structure constraints. It is most significant, as I see it, that all three types of filling are widely found as purely phonological processes in languages which lack prosodic or templatic morphology. Consonant prothesis, the addition of a default onset consonant, is found in many languages which prohibit vowel-initial syllables. Vowel epenthesis, the filling of an empty nucleus required by syllable structure constraints with a default vowel, is also widespread. Finally, compensatory lengthening of a vowel, or more rarely compensatory gemination of a consonant, is widely exploited as a way of maintaining the moraic weight of a coda whose segmental features have been erased in the course of a derivation or which were unspecified to begin with. Thus, if there are any universal defaults which govern filling rules, this universality is to be looked for not in an abstract analogical equivalence between morphologically motivated spreading of segments and phonologically motivated spreading of tones. Rather, universality is to be looked for in the formal identity of the filling rules required by quantity-based morphology and the filling rules independently needed by phonology. The significant point about the filling rules found in languages with quantity-based morphology is that in these languages morphological rules, by directly manipulating the timing tier, create the situation in which phonological filling is called into play.
In languages without such morphology, filling processes may be called into play in various other ways, such as through nativization of loan words, or through juxtaposition of morphemes in compounding or concatenation.
Onset-filling and Coda-filling in Arabic
In the analysis which I have outlined for Arabic above, rules which affix empty timing material create the need for only one type of filling: spreading of a vowel rightward or a consonant leftward to fill an empty mora in coda position. Morphologically motivated coda filling in Classical Arabic is thus analogous to phonologically
motivated segmental spreading of a type widely observed in compensatory lengthening and related phenomena (cf. Lowenstamm and Kaye 1986).16 In Classical Arabic coda-filling is exclusively left to right (vowel lengthening) in nouns and predominantly right to left (gemination) in verbs.
(36) Coda-filling
vb. yukaμtibu
n. kilaμbun
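The difference between the two directions of coda filling can be sketched in a few lines of Python. This is a minimal illustration, assuming that the empty mora of (36) is written as a single placeholder character; the names `EMPTY_MORA` and `fill_coda` are hypothetical and introduced only for this example.

```python
EMPTY_MORA = "μ"  # placeholder for a timing slot with no segmental content


def fill_coda(form, direction):
    """Fill each empty mora by copying an adjacent segment.

    direction="L-to-R": the segment on the left spreads rightward
    (vowel lengthening). direction="R-to-L": the segment on the right
    spreads leftward (gemination).
    """
    if direction not in ("L-to-R", "R-to-L"):
        raise ValueError("direction must be 'L-to-R' or 'R-to-L'")
    segments = list(form)
    for i, seg in enumerate(segments):
        if seg != EMPTY_MORA:
            continue
        source = i - 1 if direction == "L-to-R" else i + 1
        if source < 0 or source >= len(segments) or segments[source] == EMPTY_MORA:
            raise ValueError("no adjacent segment to spread from")
        segments[i] = segments[source]
    return "".join(segments)


# Verb: the following consonant spreads leftward into the empty mora.
print(fill_coda("yuka" + EMPTY_MORA + "tibu", "R-to-L"))  # yukattibu
# Noun: the preceding vowel spreads rightward into the empty mora.
print(fill_coda("kila" + EMPTY_MORA + "bun", "L-to-R"))   # kilaabun
```

In both cases the copy is strictly local: the empty mora takes its content from an immediate neighbour, never from a segment across a vowel or consonant.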
The only conditions in which empty onsets appear, and in which spreading across a vowel might be expected, arise in the template mapping of two-consonant CvC and CvvC stems (and also three-consonant stems with initial Cvv..). In verbs, and productively in nouns,17 empty onset positions are filled by a default consonant, /w/, which becomes /ʔ/ in the environment after long /aa/. As the examples in (25) and (26) above show, two-consonant verbs add a default consonant to the left edge of the stem (ya-ṣilu >> waṣala, yuwaṣṣilu) while two-consonant nouns add a default consonant to the right edge (damun >> dimaaʔ-un). The different position of the default consonant in two-consonant verbs and nouns is consistent with the assumption of a different directional bias in association for verbs and nouns. Affixation is also predominantly leftward in the verb (prefixation) and exclusively rightward in the noun (suffixation). In verb and noun stems with a medial long vowel, the coda vowel maps to an onset position in the template, creating a medial glide. The derivation of the stem II verbs and the plural nouns proceeds as follows. I assume that the segments filling the peripheral Cv- and -(v) in verbs and -v(C) in nouns are pre-associated to corresponding peripheral slots on the output template (or perhaps rather that these inflectional affixes are bracketed out and play no role in the derivation). The onset and coda segments of the remaining stem are then mapped to the onset positions of the output template, from right to left in the verb and from left to right in the noun, with default consonants supplied to fill empty onset positions on the template. In the case of CvvC stems, the coda vowel maps to an onset position, forcing the vowel to resyllabify as a glide. Finally, empty coda positions in the output template are filled by local spreading, which also proceeds right to left in the verb and left to right in the noun.
(37) Association and Default Onset-filling
verbs
input:      ya-CvCu      ya-CvvC      ya-CCvC
template:   CμCμμCμCμ    CμCμμCμCμ    CμCμμCμCμ
syllables:  σ σ σ σ      σ σ σ σ      σ σ σ σ
output:     yuGvμCiC     yuCvμGiC     yuCvμCiC
nouns
input:      CvC-un       CvvC-un      CvCC-un
template:   CμCμμCμC     CμCμμCμC     CμCμμCμC
syllables:  σ σ σ        σ σ σ        σ σ σ
output:     CvCvμGun     CvCvμCun     CvCvμCun
Geminate Root and Overlong Nouns Re-evaluated
Before we can reject default directional association altogether, it is necessary to look again at the two notable idiosyncrasies of Arabic morphology for which default L-to-R association and spreading does appear to provide an explanation. The first is the stray erasure phenomenon that appears in the plurals of nouns whose singulars have more than four consonants (ʕandaliib >> ʕanaadil ‘nightingale’, with the loss of final /b/). The loss of the final consonant in the plural of these nouns presumably represents a case where a rightmost segment goes unassociated because there is no slot in the template for it to occupy. The second is the skew in the distribution of roots with only two distinct consonants. Roots of the type XYY (traditionally termed geminate roots), as in tamma ‘he completed’ and madda ‘he stretched’, are common in Arabic, while XXY roots are non-existent and XYX roots are rare (cf. Greenberg 1950). McCarthy (1979) proposed that geminate stems are underlyingly biconsonantal (XY) with default spreading of the second consonant. In fact the analysis of both of these phenomena in terms of an L-to-R default rests on incomplete and inadequate data.
Plurals of Overlong Nouns: In the first case, as Yip (1988) and McCarthy and Prince (1990a) point out, it is not always the final consonant of five-consonant nouns which is lost in these cases (cf.
the examples in 27, e.g., zanbarakun >> zanaabiku ‘spring’, where a medial /r/ is lost). McCarthy and Prince (1990a: 273–75) attempt to salvage the L-to-R default by treating these plurals as non-templatic: ‘The loss of a consonant is not a response to template satisfaction, which predicts a loss of peripheral consonant only ...’. But this analysis does not hold. Template satisfaction predicts only that a consonant will be lost in these cases. That the consonant lost will be peripheral is predicted not by template satisfaction but by default L-to-R spreading. Template satisfaction is met in these cases. Default L-to-R spreading is violated. (Deletion defaults are discussed later.)
Biconsonantal Interpretation of Geminates: McCarthy (1979, and later) interpreted the skew in the distribution of roots with only two distinct consonants as following from the Obligatory Contour Principle (OCP) (adjacent identical autosegments are not allowed) and the L-to-R default. The OCP precludes underlying roots with adjacent identical segments, so XYY roots must be underlyingly XY. The L-to-R default then predicts that the surface form must emerge as XYY rather than XXY. This analysis would not be permitted within the framework developed in this paper, since I do not assume underlying roots and since underived words are not formed on a template. Yet, even assuming a grammar based on roots, an underlying two-consonant root for these geminate stems cannot be legitimately postulated. The first problem with a biconsonantal analysis is essentially the same as Kiparsky’s (1968) alternation argument against Chomsky and Halle’s (1968) postulation of an underlying /x/ segment in English words like fight. There is no way a child could deduce the proposed underlying form on the basis of the data it hears. Contrast the case of CvvC- stems, which are assumed to derive from underlying roots with a medial glide. In an analysis which assumes underlying roots, it is legitimate to assume that a word like baabun ‘door’ has an underlying root b-w-b (although I would not make such an assumption). This is first because Arabic has a productive phonological rule deleting a glide between short vowels (except a high vowel followed by a low vowel), so that a word *bawabun would surface as baabun.18 Second, the ‘underlying’ /w/ (re-)appears in some forms of the word (e.g., pl. abwaabun). For a form like sammun ‘poison’, on the other hand, there are no forms of the word which appear with only a single /m/ and there are no phonological rules which might be responsible for geminating the final /m/ in this environment.19
The most telling argument against the biconsonantal analysis of geminates, however, is simply that true surface two-consonant CVC stems do exist in Arabic. Although the number of these is small, there are nonetheless a few minimal pairs in which a biconsonantal and a geminate stem contrast:20
(38)
ḥamun ‘father-in-law’      ḥammun ‘making something hot (verbal noun)’
amatun ‘slave girl’        ummatun ‘nation’, ‘community’
smun ‘name’21              sammun ‘poison’
šafatun ‘lip’              šaffun ‘gauze’
True two-consonant words never undergo spreading, even in derivation. Rather, when these are mapped onto a 3C template, an extra consonant is supplied to fill the extra C slot by default, not by left-to-right spreading (ḥamun >> ʔaḥmaaʔun).
Defaults in Other Languages
For templatic and quantity-based morphology in languages other than Classical Arabic, L-to-R spreading as a default has been nearly universally rejected. Heath (1987: 12) assumes both edge-in and L-to-R spreading on a category-particular basis for Moroccan Arabic. Buckley (1990) argues for edge-in association followed by L-to-R spreading for Tigrinya. But the majority of researchers working on languages other than Classical Arabic have rejected directionality altogether. Hobermann (1988, 1992, cf. also 1995), on the basis of an analysis of four Semitic languages (Syriac, Classical Arabic, Biblical Hebrew, and Levantine Colloquial Arabic), argues that local spreading (lengthening or gemination) is preferred to long-distance spreading (that is, spreading of a consonant across a nucleus, or of a vowel across a consonant). Local spreading necessarily involves spreading into or out of coda position, and in fact all of Hoberman’s examples can be analyzed as coda filling. Rather than spreading a consonant to fill empty C positions, ‘the Ethiopian Semitic language Amharic chooses the option of inserting a default segment, illustrated by the appearance of t in the final slot of certain templates’ (Broselow 1995: 182, citing Broselow 1984).
In Tashlhyt Berber, there is no evidence for directionality in mapping to templates (Dell and Elmedlaoui 1992: 112). Empty C positions in most templates are filled by a default consonant /w/ or /y/, or more rarely /m/. Dell and Elmedlaoui give examples of one templatic pattern involving a prefix tiX- (e.g., tirrugza ‘manhood’ from a-rgaz ‘man’), where the coda of the prefix is filled by the first consonant of a stem with four or more consonants and by copying (i.e., gemination or coda filling) of the first consonant of a stem with fewer than four consonants. In Sierra Miwok, ‘no segment is repeated non-adjacently’ (Smith 1985: 366). In other words, the only spreading involves consonant gemination or vowel lengthening, which in the examples given by Smith can always be interpreted as coda filling. ‘If there are insufficient segments on one or both melodic tiers then in the case of the derived templates ... no spreading takes place.... What does happen is that default rules associate the unmarked vowel /y/ and the unmarked consonant /ʔ/ with any unlinked skeletal slot’ (Smith 1985: 367). In the examples given, in fact, the consonantal default appears only in onset or stem-final position. Smith also notes (1985: 368) that default consonants (or vowels) in this language only appear in derived structures. For Yawelmani, Archangeli assumes left-to-right association (1988[84]: 4), but the only type of spreading which is found is syllable-internal spread of a vowel. There are no geminate consonants. Empty C slots are filled by a ‘floating consonant’: /h/, /ʔ/, or /l/ (p. 196). For the so-called y-grade formations of Choctaw, Hammond (1993), Lombardi and McCarthy (1991) and Ulrich (1994) have developed an analysis in terms of edge-in association, but without directional spreading. Under these analyses empty onsets are filled by a default consonant /y/ and empty morae are filled either by vowel spreading or by leftward spread of a /y/ in the onset of the following syllable.
Ulrich and Hammond also allow for onset filling by local spreading of a consonant in coda position. Lombardi and McCarthy on the other hand treat the same data in terms of affixing of a mora, which falls in syllable coda position and is filled by spreading of an adjacent segment.
Universal Defaults
Filling Defaults: Although various labels have been applied to it (local spread, spreading within a syllable, copy to an adjacent syllable, lengthening, gemination), the process by which an empty coda is supplied by morphology and filled by spreading of an adjacent segment from either the left or the right has clearly been established as a significant operation in quantity-based morphology. Spreading may be from either direction on a language-particular or category-particular basis, but no alternative form of coda filling has been attested. The same default governs compensatory gemination or lengthening in languages in which quantity is not exploited morphologically, but is distinctive lexically. For filling empty onset positions or empty extrametrical codas in word-final position, there would appear to be not one but three default options, one or the other of which may be favoured on a language-particular basis. (a) An empty position may be filled by a default consonant, such as a glide or glottal stop, which has a high sonority or shares a high number of features with a vowel (phonological default). The quality of the consonants filling this role shows a restricted range across a variety of languages. (b) An empty position may be filled by a language-particular default consonant (like Amharic /t/, Berber /m/, Yawelmani /l/). Often this consonant is one which is otherwise prominent as an affix in the particular language (the morphological default or pseudo-affix). (c) Finally, spreading (i.e., reduplication) is an option, either rightward or leftward.
(39) Universal onset-filling defaults
(a) phonological default: high sonority consonant
(b) morphological default: pseudo-affix
(c) spreading default: copy of adjacent consonant
All three possibilities are attested in the Pero plural verb forms listed above in (12). The phonological default is /ʔ/ (which does not undergo gemination), as in biiro >> biʔiro ‘make fire’.
Also, /y/ is a default for single-consonant stems only, as in ci >> ciyyo ‘eat’. The morphological default (as in a number of other Afroasiatic languages) is /t/, as in ceko >> cekkuto ‘lose’. Finally, R-to-L spreading is found, as in bina >> bibbina ‘wash’. Superficially, L-to-R spreading also seems to be attested in forms like foojo >> fojjujo ‘push’, but this is rather a case of assimilation of the default /t/ to the stem /j/.
(40) Onset filling in Pero plural verbs
1.  ci       >>  ciyyo      ‘eat’
    biiro    >>  biʔiro     ‘make fire’
2.  ceko     >>  cekkuto    ‘lose’
    fundo    >>  funduto    ‘cook’
3.  bina     >>  bibbina    ‘wash’
   [foojo    >>  fojjujo    ‘push’]
All three possibilities are also found in the filling of the extrametrical final consonant in the plurals of two-consonant nouns in Geez, as exemplified in (41). The phonological default is /w/, the morphological default is /t/, and spreading goes left to right.
(41) Extra-metrical coda filling in Geez noun plurals
1.  ḥam   >>  aḥmaaw   ‘father-in-law’
2.  sem   >>  asmaat   ‘name’
3.  ḍar   >>  aḍraar   ‘enemy’
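The three options for supplying the extra final consonant in the Geez plurals of (41) can be laid out as a small Python sketch. It models only the choice of the added consonant, not the vocalic pattern of the plural; the function name `fill_extra_slot`, its parameter names, and the representation of a stem as a list of consonants are assumptions made for illustration.

```python
def fill_extra_slot(consonants, strategy, phonological="w", pseudo_affix="t"):
    """Supply one consonant for an extra template slot at the right edge.

    strategy is one of "phonological", "morphological", "spreading".
    The default values follow the Geez defaults named in the text.
    """
    if not consonants:
        raise ValueError("stem needs at least one consonant")
    if strategy == "phonological":
        extra = phonological        # high sonority default consonant
    elif strategy == "morphological":
        extra = pseudo_affix        # consonant otherwise prominent as an affix
    elif strategy == "spreading":
        extra = consonants[-1]      # left-to-right copy of the adjacent consonant
    else:
        raise ValueError("unknown strategy: " + strategy)
    return list(consonants) + [extra]


print(fill_extra_slot(["ḥ", "m"], "phonological"))   # ['ḥ', 'm', 'w']  cf. aḥmaaw
print(fill_extra_slot(["s", "m"], "morphological"))  # ['s', 'm', 't']  cf. asmaat
print(fill_extra_slot(["ḍ", "r"], "spreading"))      # ['ḍ', 'r', 'r']  cf. aḍraar
```

Which strategy a given noun selects is lexical information in this sketch and has to be passed in by the caller; nothing in the code predicts it.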
In Arabic the phonological default (ʔ, w) is overwhelmingly preferred. The /n/ in baabun >> biibaanun arguably represents the pseudo-affix option, although a right-to-left spreading of the consonant of the indefinite article suffix (‘nunation’) is also possible. The spreading option is virtually never employed. One case is in plurals of CvvCvvCun nouns like daabiix >> dabaabiix. Spreading here is R-to-L since it is the vocalic coda of the first syllable which falls into an onset position in the derived form and must be filled by a consonant (thus underlying daCaabiix).
Deletion Defaults: Cases of deletion of a consonant to satisfy a templatic output are much less common than addition of a default consonant for the same purpose. Outside the Semitic domain, the only example I am aware of is the Sierra Miwok example given by Goldsmith (1990: 87): basic verb stem tolook, derived from the noun tolookošu ‘three’. Even in Geez, which has a morphology very similar to Arabic, the broken plurals of overlong nouns do not show loss of a consonant (Ratcliffe 1992). Thus, the search for universals would appear to be jeopardized by the fact that most of our data for deletion defaults come from a single language, Classical Arabic. Within Classical Arabic, the traditional grammarians have established that there is a range of variation for the consonant chosen for deletion. The facts are discussed by both McCarthy and Prince
(1990a: 274ff.) and by Yip (1988). Besides loss of a final consonant, the consonant lost may be a high sonority consonant (y, w, r) or a consonant which is otherwise prominent in the language as an affix (t, n, m). As noted earlier (in Plurals of Overlong Nouns), the existence of this range of choices for deletion provides some of the most troublesome counterevidence for any theory of default directionality, and neither Yip nor McCarthy and Prince is able to provide a theoretically consistent account of it. In fact, in light of the previous discussion of filling defaults, the problem is easily resolved. The range of defaults for consonant deletion in Arabic is exactly the same as (or rather the mirror image of) the range of defaults for onset filling cross-linguistically.
(42) Universal consonant deletion defaults
1. phonological default: deletion of high sonority consonant
2. morphological default: deletion of pseudo-affix
3. spreading default: deletion of edge-most consonant
Furthermore, it is apparent that the strategies available for determining a consonant to drop are essentially the same as the strategies available for determining the part of the word over which a derivational rule applies in the first place (as discussed in The Consonantal Root). First, applying a horizontal segmentation strategy, a speaker may strip off the sonority peaks (vowels) of a word and thus identify a consonantal ‘root’ as the basis for a derivation. In cases where the remaining consonants are still too many for the positions in the output template, the same strategy identifies the highest sonority consonant as the target for deletion. Applying a vertical segmentation strategy, a speaker identifies a string of a certain length as the basis for a derivation. By the same strategy, material at the edge of the word which exceeds a certain length is eligible for deletion.
Finally, on the basis of knowledge of alternation patterns in a given language, a speaker may separate a stem from an affix and perform an operation only on the stem. In the same way a speaker may (mis)identify as a target for deletion a segment or string which is formally similar to an affix widely found in the language.
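The three deletion defaults in (42) can be sketched as a choice among candidate consonants. The Python below is a rough illustration under stated assumptions: the names `HIGH_SONORITY`, `AFFIX_LIKE` and `delete_one` are hypothetical, the consonant sets are the ones named in the text for Arabic, and the tie-breaking rule (take the rightmost candidate when several qualify) is an assumption of the sketch, not a claim of the analysis.

```python
HIGH_SONORITY = ("y", "w", "r")  # high sonority consonants named in the text
AFFIX_LIKE = ("t", "n", "m")     # consonants prominent as affixes in Arabic


def delete_one(consonants, strategy):
    """Drop one consonant from an overlong stem under one deletion default."""
    consonants = list(consonants)
    if not consonants:
        return consonants
    if strategy == "phonological":
        candidates = [i for i, c in enumerate(consonants) if c in HIGH_SONORITY]
    elif strategy == "morphological":
        candidates = [i for i, c in enumerate(consonants) if c in AFFIX_LIKE]
    elif strategy == "spreading":
        candidates = [len(consonants) - 1]  # edge-most (final) consonant
    else:
        raise ValueError("unknown strategy: " + strategy)
    if not candidates:
        return consonants  # nothing qualifies under this default
    # Assumption of this sketch: with several candidates, drop the rightmost.
    target = candidates[-1]
    return consonants[:target] + consonants[target + 1:]


# zanbarakun >> zanaabiku: the medial /r/ is lost.
print(delete_one(["z", "n", "b", "r", "k"], "phonological"))  # ['z', 'n', 'b', 'k']
# ʕandaliib >> ʕanaadil: the final /b/ is lost.
print(delete_one(["ʕ", "n", "d", "l", "b"], "spreading"))     # ['ʕ', 'n', 'd', 'l']
```

As with the filling sketch, the strategy is supplied by the caller; the point is only that each attested loss corresponds to one of the three options.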
Conclusion
Most approaches to morphology assume that regularity or consistency in a morphological system is to be looked for in terms of rules
relating inputs to outputs. Evidence from a variety of languages (Yawelmani, Sierra Miwok, Ibibio, Tashlhyt, Pero, English) now shows that regularity of a different type is also quite common in natural languages. In templatic morphology regularity consists in the fact that derived words associated with a particular function show a consistent syllabic (sometimes also vocalic) pattern regardless of the phonological form of the source from which they are derived. The fixed pattern or template is thus most economically analyzed as a constraint on the output of a derivation, and templatic morphology as constraint-based rather than rule-based. The Arabic and Semiticist tradition of morphological analysis assumes that shape invariance is the norm. Modern formal linguists working on Semitic languages have largely adopted this assumption uncritically. Hence we find modern researchers proposing templatic solutions to problems in the analysis of Arabic which perhaps admit much simpler non-templatic solutions. The statistically prominent syllabic patterns of the underived noun are a case in point: they are explained in templatic terms by McCarthy and Prince, but are due, in our view, to general economy considerations. Another example is provided by derived verbs VII, VIII and X, treated as templatic in Autosegmental Morphology but accounted for much more simply, I believe, in terms of affixation plus default syllabification. Most of the core ‘internal’ morphology of Arabic (derived verb stems II, III and IX, the active participle of the base verb, diminutives, some internal plurals) can be analyzed in terms of a single basic affixation rule combined with apophonic changes. For some categories, such as the noun plural, both affixation and mapping to a template are available as word-formation strategies, leading to a situation of allomorphy.
Where templatic morphology is clearly motivated in Arabic, the fixed output template may function as a primary indicator of a morphological category, but also, as in other languages which have been identified as having templatic morphology, it may simply function as a stem allomorph triggered by an affix or set of affixes. In short, it has become clear that the difference between templatic (output-consistent) and rule-based, generally affixational (change-consistent) morphology is not an absolute difference characterizing different language types. Rather, both types of morphology may be found within the morphological system of one and the same language; and templatic morphology is a possibility of language, not of some languages. Within the context of the Autosegmental
Morphology Hypothesis, the possibility of templatic morphology in Arabic is accounted for by postulation of language-particular idiosyncrasies of lexical and phonological structure, namely that the lexicon lists abstract consonantal roots rather than words and that word forms in Arabic have underlying representations in which vowels and consonants are represented on separate tiers. An approach based on assuming absolute differences in the underlying phonological organization of languages to account for scalar differences in the degree to which templatic morphology is used (or not used) is in principle undesirable. I have argued that it is also empirically inferior to the theoretically less idiosyncratic alternative which dispenses with consonantal root morphemes and morpheme planes. I have argued that the Arabic lexicon is word-based and that templatic morphology is made possible by universal elements of phonological structure, namely the fact that the phonological strings that make up words are composed into syllables which in turn consist of peaks and troughs of sonority. In templatic morphology there is often a mismatch between the unspecified timing positions of the template and the segmental material available to fill these positions. In such cases filling or deletion processes are called into play. Autosegmental approaches have attempted to account for these filling and deletion processes in terms of a default directional spreading parameter. Research on a variety of templatic systems has provided at best limited empirical support for, and much counterevidence against, default directional spreading. The Arabic data upon which the claim of default L-to-R spreading was originally based turn out upon closer inspection to provide no evidence in support of the claim.
In place of default directionality we have argued that the filling and deletion processes found in templatic and quantity-based morphology can be accounted for by principles independently required by phonological theory. Morphologically triggered coda filling by spreading of an adjacent segment is identical to compensatory lengthening. The existence of three possible strategies, sometimes even within one language, for onset filling (by phonological default, by insertion of a pseudo-affix, and by copying or reduplication) follows from the fact that words may be segmented in at least three different ways on a language-universal basis. The three different strategies for deleting segments to satisfy templates in Classical Arabic, the existence of which is problematic under the assumption of default
directionality, mirror the onset filling strategies and follow from the same universal of human language.
Notes
1. See Suleiman 1995 for a good recent survey of the tradition and further references.
2. Cf. Odden (1988: 451): ‘The possibility that vowels and consonants occupy separate tiers on a language specific basis radically expands the power of phonological theory and predicts unattested patterns of inalterability and across-the-board rule application.’
3. Taking these analyses into account, McCarthy (1989) argues that templatic morphological systems will always exhibit planar V/C segregation, because planes render linear order relations redundant. Thus, the notion of consonant strings as root morphemes is no longer an essential part of the theory, although McCarthy (1989) still assumes that C strings are morphemes in Arabic.
4. The notion that the lexicon is the repository of idiosyncratic information I take to be an axiom of modern formal linguistics (Anderson 1992: 48–72, Aronoff 1976, 1994: 16ff., cf. Bloomfield 1933: 274, Carstairs-McCarthy 1992: 11–51, Chomsky 1965: 87, Di Sciullo and Williams 1987, Spencer 1991: 81–91).
5. The Arabic verb occurs in two basic conjugational forms, termed by the Arab grammarians al-maḍii (‘the past’), and al-muḍaariʕ (‘the resembling’ form, because its inflections resemble those of the nouns), and labelled here the ‘past’ and ‘non-past’, respectively. Comparative Semiticists in the nineteenth century reanalyzed the distinction as one of aspect (perfect/imperfect), and this distinction is still maintained by many Arabists and by most of the standard reference grammars. I am personally convinced by the arguments of Kurylowicz (1972) and Zaborski (1995) that the distinction here is fundamentally one of tense and only secondarily one of aspect, and hence I adopt the past/non-past terminology. In any case the issue of the semantics and syntax of the forms has only marginal bearing on the present discussion.
6. The vocalism of the past stem is more often predictable from that of the non-past stem than vice versa.
Further, the non-past stem of the form I or underived verbs shows a variety of syllabic shapes (-CvC, -CvvC, and -CCvC), consistent with the notion that it is a lexical entry, according to the principle that the lexicon contains
7.
8.
9.
10.
11.
12.
13.
idiosyncratic information about words. The past shows a consistent (templatic) pattern (CvCvC-), which is consistent with the idea that this is a derived form, whose shape can be accounted for by rule. This argument has frequently been made; see Heath (1987), McComber (1995), Schramm (1962, 1991) and Voigt (1987). Since prosodic circumscription explicitly allows reference to a phonologically, rather than morphologically, defined part of a word, the fact that the initial heavy syllable of verbs is heteromorphemic (person/number/gender prefix ya-, ta- etc., + first consonant of stem CCVC) presents no difficulty for the analysis. The affixation of a bimoraic sequence to a bimoraic heavy syllable yields a trimoraic output, because the coda mora of the first syllable is forced to resyllabify as an onset in the output. In order to reconcile this analysis with the existence of verbal adjective CaCiiC, CaCuuC and verbal nouns CiCaaC, CaCaaC, we must assume that these deverbal nouns are derived from the perfect verbal stem, which again is plausible on semantic grounds. ‘From the standpoint of the theory of Prosodic Morphology ... the Arabic verb is quite puzzling. The verb forms are prosodically incoherent and difficult to rationalize with anything like the Iamb Rule or the Prosodic Morphology Hypothesis’ (McCarthy 1993: 201). Further we might note the anomalous fact that deletion of the Cv-prefix of the non-past to form the past triggers prothesis in the case of the derived stems, but epenthesis in the case of the base form. It is only this and a difference in vowel quality that distinguishes stem IV (yufʕilu/ʔafʕala) from the base verb. Taking prothesis as the unmarked option, a default syllabification analysis, rather than a templatic analysis, is available for stem IV as well. Goldsmith (1990: 123) suggests the following universal principle of syllabification: ‘We shall take syllabification to apply ... to construct syllable structure in a minimal fashion (i.e., with the minimal number of syllables) to cover the maximum number of segments possible.’ Some variation in the form of the past stem is due to the effects of a late phonological syncope rule deleting a vowel between identical consonants or a glide between short vowels having the same feature specification for height, thus qaala < *qaGala, dalla < *dalVla; see Angoujard (1990: 53ff.), cf. Brame (1970), Levy (1971: 118ff.), Moore (1990). Statistical studies (Levy 1971, Murtonen 1964) have shown, contrary to the assumptions in traditional grammars, that the distribution of plural allomorphs is largely predictable on the basis of the form of the singular. The organization of the system presented here follows from that in Ratcliffe (1992, 1998b). A handful of rare and anomalous patterns are left out. I restrict the analysis to strictly formal
factors, although marginally greater predictability can be achieved by indicating some semantic features of the singular (e.g., animacy of the referent).
14. This competition and variability is perhaps easier to accommodate in a diachronic account (see Ratcliffe 1992, 1996, 1998b) than in a synchronic one. The less productive forms of the broken plural can be interpreted as representing old innovations by one dialect group or one generation of speakers which have become fossilized or lexicalized. The derivations offered in (15)–(18) would thus reflect true historical processes of analogy.
15. Specifically, we cannot assume that autosegmental association explains anything about the phonological structure of underived words. Association to a template takes place only in derivation.
16. Lowenstamm and Kaye argue that gemination is preferred universally (which would be default right-to-left spreading). This argument is based on comparison of the form II or D-stem verb in Arabic (gemination universal) with its Hebrew cognate (vowel lengthening substitutes for gemination under some conditions). But the same argument could be made for vowel lengthening as a default on the basis of the noun plural forms in Arabic and Tigre. Arabic has a plural form CaCaaCiC- in which the heaviness of the second syllable is always created by a long vowel. The cognate Tigre form may have a vowel corresponding to Arabic long /aa/ in the second syllable, or alternatively may have a correspondent of Arabic short /a/ plus gemination of the following consonant: Ar. ʿanaakib, Ti. anakk b ‘spiders’.
17. In two marginal cases in the noun morphology empty onset positions are filled by right-to-left spreading (plurals baabun >> biibaanun and daabiix >> dabaabiix). Our analysis of these forms assumes that the underlying stem of baabun is baab- and that the /w/ is a default consonant required by template satisfaction and not part of the underlying form.
18. McCarthy (1986: 210) actually acknowledges this: ‘The OCP forces us to say that there is also spreading of the single root consonant m in samam, even though this verb and all forms relating to it invariably have at least two m’s on the surface. This in itself is remarkable, since it means that the OCP actually demands a certain measure of abstractness in phonological representations even when unsupported by alternations.’ In other words, the OCP or the L-to-R default cannot actually be postulated on the basis of the Arabic data. But if we have some reason to assume that the OCP and L-to-R are universals of human language, then a certain analysis imposes itself on the Arabic data.
19. Dell and Elmedlaoui (1992: 91–94) make the same argument against a biconsonantal interpretation of words with geminate consonants in Berber.
20. Some of the pairs in (32) only contrast minimally on the level of the ‘consonantal root’, since at the word level differences in vowel quality or the presence or absence of the affix -at may distinguish the words. Of course, in the original theory it was precisely at the root level that a distinction between, e.g., ḥ-m and ḥ-m-m or s-m and s-m-m was supposed to be impossible.
21. Phonetically /ismun/, with prothetic /i-/.
References

Akinlabi, Akinbiyi and Eno Urua. 1993. ‘Prosodic target and vocalic specification in the Ibibio verb’. In The Proceedings of the Eleventh West Coast Conference on Formal Linguistics, ed. by Jonathan Mead, pp. 1–14. Stanford, Calif.: Stanford Linguistic Association.
Anderson, Stephen. 1992. A-Morphous Morphology. Cambridge: Cambridge University Press.
Angoujard, Jean Pierre. 1990. Metrical Structure of Arabic. Dordrecht: Foris Publications.
Archangeli, Diana. 1988. Underspecification in Yawelmani Phonology and Morphology. New York: Garland Publishing.
Archangeli, Diana. 1991. ‘Syllabification and prosodic templates in Yawelmani’. NLLT 9: 231–83.
Aronoff, Mark. 1976. Word Formation in Generative Grammar. Cambridge, Mass.: MIT Press.
Aronoff, Mark. 1994. Morphology by Itself. Cambridge, Mass.: MIT Press.
Bat-el, Outi. 1986. ‘Extraction in modern Hebrew morphology’. Unpublished M.A. thesis. Los Angeles: UCLA.
Bat-el, Outi. 1989. Phonology and Word Structure in Modern Hebrew. Doctoral dissertation. Los Angeles: UCLA.
Bat-el, Outi. 1994. ‘Stem modification and cluster transfer in modern Hebrew’. NLLT 12: 571–96.
Beard, Robert. 1995. Lexeme-Morpheme Base Morphology. Albany: SUNY Press.
Bloomfield, Leonard. 1933. Language. New York: Henry Holt.
Brame, Michael. 1970. ‘Arabic phonology: Implications for phonological theory and historical Semitic’. Doctoral dissertation. Cambridge, Mass.: MIT Press.
Broselow, Ellen. 1984. ‘Default consonants in Amharic phonology’. In Papers from the January 1984 MIT Workshop in Morphology, Vol. 7, ed. by M. Speas and R. Sproat, pp. 15–32. Cambridge, Mass.: MIT Dept. of Linguistics.
Broselow, Ellen. 1995. ‘Skeletal positions and moras’. In The Handbook of Phonological Theory, ed. by Goldsmith, pp. 175–205. Oxford, UK/Cambridge, Mass.: Basil Blackwell.
Buckley, Eugene. 1990. ‘Edge-in Association and OCP “Violations” in Tigrinya’. In The Proceedings of the Ninth West Coast Conference on Formal Linguistics, ed. by Aaron L. Halpern, pp. 75–90. Stanford, Calif.: Stanford Linguistic Association.
Carstairs-McCarthy, Andrew. 1992. Current Morphology. London: Routledge.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English. New York: Harper and Row.
Dell, François and Mohamed Elmedlaoui. 1992. ‘Quantitative transfer in the nonconcatenative morphology of Imdlawn Tashlhiyt Berber’. Journal of Afroasiatic Languages 3(2): 89–125.
Di Sciullo, Anna Maria and Edwin Williams. 1987. On the Definition of Word. Cambridge, Mass.: MIT Press.
Eid, Mushira and John McCarthy, eds. 1990. Perspectives on Arabic Linguistics II. Amsterdam/Philadelphia: John Benjamins.
Farwaneh, Samira. 1990. ‘Well-Formed Associations in Arabic: Rule or Condition?’ In Perspectives on Arabic Linguistics II, ed. by Eid and McCarthy, pp. 120–42. Amsterdam/Philadelphia: John Benjamins.
Frajzyngier, Zygmunt. 1977. ‘The Plural in Chadic’. In Papers in Chadic Linguistics, ed. by Paul Newman and Roxana Ma Newman, pp. 37–56. Leiden: Afrika-Studiecentrum.
Goldsmith, John. 1990. Autosegmental and Metrical Phonology. Oxford, UK/Cambridge, Mass.: Basil Blackwell.
Goldsmith, John, ed. 1995. The Handbook of Phonological Theory. Oxford, UK/Cambridge, Mass.: Basil Blackwell.
Greenberg, Joseph H. 1950. ‘The patterning of root morphemes in Semitic’. Word 6: 162–81.
Halle, Morris. 1973. ‘Prolegomena to a theory of word-formation’. Linguistic Inquiry 6: 6–36.
Halle and Mohanan. 1985. ‘Segmental phonology of modern English’. Linguistic Inquiry 16. 22: 457–501.
Hammond, Michael. 1988. ‘Templatic Transfer in Arabic broken plurals’. NLLT 6: 247–70.
Hammond, Michael. 1993. ‘Heavy Trochees in Choctaw Morphology’. Phonology 10: 325–36.
Heath, Jeffrey. 1987. Ablaut and Ambiguity: Phonology of a Moroccan Arabic Dialect. Albany: SUNY Press.
Hobermann, Robert. 1988. ‘Local and long-distance spreading in Semitic morphology’. NLLT 6: 541–49.
Hobermann, Robert. 1992. ‘Local Spreading’. Journal of Afroasiatic Languages 3(3): 226–55.
Hobermann, Robert. 1995. ‘Current issues in Semitic phonology’. In The Handbook of Phonological Theory, ed. by Goldsmith, pp. 839–47. Mass.: Basil Blackwell.
Hudson, Grover. 1985. ‘Arabic root and pattern morphology without tiers’. Journal of Linguistics 22: 85–122.
Hudson, Grover. 1995. ‘Phonology of Ethiopian Languages’. In The Handbook of Phonological Theory, ed. by Goldsmith, pp. 782–97. Mass.: Basil Blackwell.
Keegan, John M. 1987. ‘Morpheme-based Morphology’. Penn Review of Linguistics. 61–75.
Kiparsky, Paul. 1968. ‘How abstract is phonology’. Reprinted as ‘Abstractness, opacity, and global rules’ [1973]. In Three Dimensions of Linguistic Theory, ed. by O. Fujimura, pp. 1–136. Tokyo: Taikusha.
Kisseberth, Charles. 1970. ‘Vowel elision in Tonkawa and derivational constraints’. In Studies Presented to Robert B. Lees by his Students, ed. by J.M. Sadock and A.L. Vanek. Champlain: Linguistic Research, Inc.
Kurylowicz, Jerzy. 1972. Studies in Semitic Grammar and Metrics. Wroclaw.
Levy, Mary M. 1971. The Plural of the Noun in Modern Standard Arabic. Doctoral dissertation. Ann Arbor: University of Michigan.
Lieber, Rochelle. 1987. An Integrated Theory of Autosegmental Processes. Albany: SUNY Press.
Lieber, Rochelle. 1995. Deconstructing Morphology: Word Formation in Syntactic Theory. Chicago: University of Chicago Press.
Lombardi, Linda and John McCarthy. 1991. ‘Prosodic circumscription in Choctaw phonology’. Phonology 8: 37–71.
Lowenstamm, Jean and Jonathan Kaye. 1986. ‘Compensatory lengthening in Tiberian Hebrew’. In Studies in Compensatory Lengthening, ed. by L. Wetzels and E. Sezer, pp. 97–146. Dordrecht: Foris.
Marantz, Alec. 1982. ‘Re Reduplication’. Linguistic Inquiry 13: 435–82.
McCarthy, John. 1979. Formal Problems in Semitic Phonology and Morphology. Doctoral dissertation. Cambridge, Mass.: MIT Press.
McCarthy, John. 1981. ‘A prosodic theory of nonconcatenative morphology’. Linguistic Inquiry 12: 373–418.
McCarthy, John. 1983. ‘A prosodic account of Arabic broken plurals’. In Current Trends in African Linguistics I, ed. by I. Dihoff, pp. 289–320. Dordrecht: Foris.
McCarthy, John. 1989. ‘Linear order in phonological representations’. Linguistic Inquiry 20: 71–99.
McCarthy, John. 1993. ‘Template form in prosodic morphology’. In Papers from the Third Annual Formal Linguistics Society of Midamerica Conference, ed. by L. Smith Stvan, pp. 187–218. Bloomington: IULC.
McCarthy, John and Alan Prince. 1990a. ‘Prosodic morphology and templatic morphology’. In Perspectives on Arabic Linguistics II, ed. by Eid and McCarthy, pp. 1–54. Amsterdam/Philadelphia: John Benjamins.
McCarthy, John and Alan Prince. 1990b. ‘Foot and word in prosodic morphology: The Arabic broken plural’. NLLT 8: 209–83.
McCarthy, John and Alan Prince. 1995. ‘Prosodic morphology’. In The Handbook of Phonological Theory, ed. by Goldsmith, pp. 318–66. Oxford, UK/Cambridge, Mass.: Basil Blackwell.
McOmber, Michael. 1995. ‘Morpheme edges and templatic redundancy’. In Perspectives on Arabic Linguistics VII, ed. by Mushira Eid, pp. 173–190. Amsterdam/Philadelphia: John Benjamins.
Schramm, Gene. 1962. ‘An outline of Classical Arabic verb structure’. Language 38(4): 360–75.
Schramm, Gene. 1991. ‘Semitic morpheme structure typology’. In Semitic Studies in Honor of Wolf Leslau on his Eighty-Fifth Birthday, ed. by Alan Kaye, pp. 1402–8. Wiesbaden: Otto Harrassowitz.
Moore, John. 1990. ‘Doubled verbs in modern standard Arabic’. In Perspectives on Arabic Linguistics II, ed. by Eid and McCarthy, pp. 55–93. Amsterdam/Philadelphia: John Benjamins.
Murtonen, A. 1964. Broken Plurals, the Origin and Development of the System. Leiden: E.J. Brill.
Noske, Roland. 1985. ‘Syllabification and syllable changing processes in Yawelmani’. In Advances in Non-linear Phonology, ed. by Harry van der Hulst and Norval Smith, pp. 335–62. Dordrecht: Foris.
Odden, David. 1988. ‘Antigemination and the OCP’. Linguistic Inquiry 19: 451–475.
Odden, David. 1995. ‘Tone: African Languages’. In The Handbook of Phonological Theory, ed. by Goldsmith, pp. 444–75. Mass.: Basil Blackwell.
Ratcliffe, Robert. 1990. ‘Arabic broken plurals: Arguments for a two-fold classification of morphology’. In Perspectives on Arabic Linguistics II, ed. by Eid and McCarthy, pp. 94–119. Amsterdam/Philadelphia: John Benjamins.
Ratcliffe, Robert. 1992. The Broken Plural Problem in Arabic, Semitic and Afroasiatic: A Solution Based on the Diachronic Application of Prosodic Analysis. Doctoral dissertation. New Haven, Ct.: Yale University.
Ratcliffe, Robert. 1996. ‘Drift and noun plural reduplication in Afroasiatic’. Bulletin of the School of Oriental and African Studies 59(2): 296–311.
Ratcliffe, Robert. 1997. ‘Templatic morphology in English: -ought/aught verbs and -ould verbs’. Proceedings of the Thirteenth Japan English Linguistic Society Conference.
Ratcliffe, Robert. 1998a. ‘Prosodic templates in a word-based morphological analysis of Arabic’. In Perspectives on Arabic Linguistics X, ed. by Mushira Eid and Robert Ratcliffe. Amsterdam/Philadelphia: John Benjamins.
Ratcliffe, Robert. 1998b. The ‘Broken’ Plural Problem in Arabic and Comparative Semitic: Allomorphy and Analogy in Non-concatenative Morphology. Amsterdam/Philadelphia: John Benjamins.
Smith, Norval. 1985. ‘Spreading, reduplication and the default option in Miwok nonconcatenative morphology’. In Advances in Non-linear Phonology, ed. by Harry van der Hulst and Norval Smith, pp. 363–80. Dordrecht: Foris.
Sommerstein, Alan H. 1974. ‘On phonotactically motivated rules’. Journal of Linguistics 10: 71–94.
Spencer, Andrew. 1988. ‘Morpholexical rules and lexical representations’. Linguistics 26: 619–40.
Spencer, Andrew. 1991. Morphological Theory. Oxford/Cambridge, Mass.: Basil Blackwell.
Stonham, John T. 1994. Combinatorial Morphology. Amsterdam/Philadelphia: John Benjamins.
Suleiman, Yasir. 1995. ‘Arabic linguistic tradition’. In Concise History of the Language Sciences, ed. by E.F.K. Koerner and R.E. Asher, pp. 28–40. Cambridge: Cambridge University Press.
Ulrich, Charles H. 1994. ‘A unified account of Choctaw intensives’. Phonology 11: 325–39.
Voigt, Rainer. 1987. Die infirmen Verbaltypen des Arabischen und das Biradikalismus-Problem. Stuttgart: Franz Steiner Verlag.
Wright, William, ed. and trans. 1896. A Grammar of the Arabic Language. 3rd edition. London: Cambridge University Press.
Yip, Moira. 1988. ‘Template morphology and the direction of association’. NLLT 6: 551–77.
Zaborski, Andrzej. 1995. ‘Kurylowicz and the so-called “aspect” in Classical and Modern Arabic’. In Kurylowicz Memorial Volume, Part 1, ed. by W. Smoczynski, pp. 529–41. Cracow: University of Cracow.
10
Paradigmatic Morphology
Thomas Becker

‘Seamless Morphology’, proposed by Ford et al. (1997) and by Singh and Dasgupta (1999), is a radical departure from the mainstream syntactic approach to morphology, called ‘word syntax’. In this article I want to show that the most important step in this departure is the move from syntagmatic to paradigmatic rules. Paradigmatic rules (or ‘strategies’, Ford et al. 1997: 1) describe paradigmatic relations between words instead of the syntagmatic structure of words. The term ‘paradigmatic’ refers only indirectly to ‘paradigms’; it is to be understood in its opposition to ‘syntagmatic’, which approximately corresponds to de Saussure’s opposition between ‘rapports syntagmatiques et rapports associatifs’ (1916: 170–75). It should be pointed out that a ‘Word and Paradigm’ model of morphology (cf. Matthews 1972) is syntagmatic when it derives inflectional forms from underlying stems. The ‘paradigmaticity’ of Seamless Morphology is independent of its other aspects (e.g., ‘seamlessness’), which seem to be debatable.
The Paradigmatic Rule Format

Paradigmatic rules (known as ‘Word Formation Rules’¹) relate inflected word forms, which are taken to be lexical entries. The words write and writer in (1a, b) are related by the rule (2):²

(1a)  write    V    ‘to trace symbols on a surface’
(1b)  writer   N    ‘one who traces symbols on a surface’

(2)   X        V    ‘to v’
      ↕
      Xer      N    ‘one who v’s’

The corresponding word-syntactic analysis would look like (3):

(3)        N
          / \
         V   N
         |   |
      write  -er

In (3) the relation between write and writer is syntagmatic: write is a constituent of writer. A paradigmatic rule like (2) serves two purposes: it describes the morphological relation between existing words, and it describes the formation of new words. Two words are related by a rule if and only if they satisfy the input and the output structure of the rule with a consistent interpretation of its variables (e.g., write and writer for X = ‘write’ and v = ‘trace symbols on a surface’). The formation of writer on the basis of write is described as follows: when the basis write is put into the input structure (or ‘unified’ with it), the variables are interpreted as above and the output structure turns into the lexical entry of writer. The paradigmatic rule explicates the traditional concept of proportional analogy. According to Paul (1880: 97), new words are formed by analogy to existing models. The formation writer is the solution of the analogical proportion in (4):³

(4)   speak : speaker = write : X    (X = writer)
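The two uses of a rule like (2), relating existing words and forming new ones, can be made concrete in a short sketch, as can the proportion in (4). The Python below is only an illustration under stated assumptions and not Becker's own formalism: the names Entry, Rule, bind, abstract_rule and solve_proportion are invented here, forms are written in a rough ASCII phonemic spelling (rait, raiter) so that X and -er concatenate cleanly, and meanings are ad hoc labels such as AGENT(WRITE).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    form: str  # rough ASCII phonemic spelling (an assumption of this sketch)
    cat: str   # lexical category, e.g. "V" or "N"
    sem: str   # ad hoc meaning label (hypothetical notation)


def bind(template: str, text: str, var: str) -> Optional[str]:
    """Return the value of {var} that turns `template` into `text`, else None.

    The template is assumed to contain the placeholder exactly once.
    """
    prefix, _, suffix = template.partition("{" + var + "}")
    core_len = len(text) - len(prefix) - len(suffix)
    if core_len > 0 and text.startswith(prefix) and text.endswith(suffix):
        return text[len(prefix):len(prefix) + core_len]
    return None


@dataclass(frozen=True)
class Rule:
    inp: Entry  # input structure, with variables {X} (form) and {v} (meaning)
    out: Entry  # output structure, sharing the same variables

    def _bindings(self, side: Entry, word: Entry) -> Optional[Tuple[str, str]]:
        if side.cat != word.cat:
            return None
        x = bind(side.form, word.form, "X")
        v = bind(side.sem, word.sem, "v")
        return (x, v) if x is not None and v is not None else None

    def relates(self, a: Entry, b: Entry) -> bool:
        """True iff a fits the input and b the output under one consistent binding."""
        ba = self._bindings(self.inp, a)
        return ba is not None and ba == self._bindings(self.out, b)

    def derive(self, base: Entry) -> Optional[Entry]:
        """Unify the base with the input structure and instantiate the output."""
        b = self._bindings(self.inp, base)
        if b is None:
            return None
        x, v = b
        return Entry(self.out.form.format(X=x), self.out.cat, self.out.sem.format(v=v))


def abstract_rule(a: Entry, b: Entry) -> Rule:
    """Abstract a rule from one model pair (handles shared initial material only)."""
    n = 0
    while n < min(len(a.form), len(b.form)) and a.form[n] == b.form[n]:
        n += 1
    inp = Entry("{X}" + a.form[n:], a.cat, "{v}")
    out = Entry("{X}" + b.form[n:], b.cat, b.sem.replace(a.sem, "{v}"))
    return Rule(inp, out)


def solve_proportion(a: Entry, b: Entry, c: Entry) -> Optional[Entry]:
    """Solve a : b = c : X by abstracting a rule from (a, b) and applying it to c."""
    return abstract_rule(a, b).derive(c)


# Rule (2) in this toy notation, plus the words of (1) and (4).
RULE_2 = Rule(Entry("{X}", "V", "{v}"), Entry("{X}er", "N", "AGENT({v})"))
write = Entry("rait", "V", "WRITE")
writer = Entry("raiter", "N", "AGENT(WRITE)")
speak = Entry("spiik", "V", "SPEAK")
speaker = Entry("spiiker", "N", "AGENT(SPEAK)")
```

Under these assumptions, RULE_2.relates(write, writer) holds, RULE_2.derive(write) yields the entry for writer, and solve_proportion(speak, speaker, write) arrives at the same entry by way of the model pair. The rule abstracted from speak : speaker is identical to RULE_2, which is the sense in which rule and proportion coincide here.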
The identification of proportional analogy with morphological rules (as advocated by Becker 1990) sheds light on both concepts. Analogy is not a mysterious antagonist of sound laws (one that sometimes makes grammars more economical and sometimes doesn’t) but the natural consequence of the interaction of phonology and morphology: speakers form new words ignoring (former) phonotactic restrictions. Morphology, on the other hand, is not word syntax plus a mysterious analogical module to cover those formations that cannot be treated syntactically (such as back-formations). When those analogical formations are treated properly, concatenative formations such as compounds or affixational derivations, the domain of word syntax, turn out to be merely trivial cases of analogical formations.
In the following, the major features of paradigmatic morphology will be discussed.
The Role of Words

From a paradigmatic point of view, words are minimal signs: the minimal units into which utterances are decomposed by the speakers in order to be stored as building-blocks for their own utterances. This does not mean that words cannot be decomposed further. Linguists can do so and have done so, and they have defined the morpheme as the minimal unit they can analyze. Speakers, however, do not seem to analyze utterances to a maximal extent but rather to an optimal extent for the purposes of language use. Different conditions obtaining in the various languages lead to different results, and in some cases it just does not matter how deep the analysis goes. Therefore, the concept of word is notoriously difficult to define, although most speakers have clear intuitions. The term need not be defined, however, as it is a basic concept of morphological theory. As minimal signs, words are autonomous individuals and do not depend on their constituents; therefore, they tend to be lexicalized, that is, they develop their own meaning, lose the properties that indicate their structure (formal fusion) and may lose their relationship to free occurrences of their ‘constituents’. Lexicalization is the normal case for word-formation and compositionality the exception, just as compositionality is the normal case for syntactic constructions and lexicalization (idioms) the exception there. Lexicalization of complex words (or ‘institutionalization’, a term which Bauer (1983: 48) and others prefer for the more subtle cases) is far more widespread than commonly accepted; a front door, e.g., is not any door that happens to be in the front of a house but only the principal entrance-door of the house. Even in German, which makes excessive use of compounding compared to English, almost all of the compounds that are actually used are institutionalized, and real neologisms do not remain unnoticed by an attentive listener.
Furthermore, even a German compound needs a ‘justification’: any fast car can be described by the phrase ein schnelles Auto, but it cannot be called ein Schnellauto (a ‘fast-car’) unless it is relevant to coin a term for this type of car in the situation of utterance. A syntactic phrase can describe any category, relevant or not, whereas a compound nominates a category which must be relevant in order to deserve a name. The phenomenon of ‘blocking’ should be seen in this context as well: despite the productivity of the rule (2), the formation stealer is blocked by the prior existence of the word thief. In word-formation there seems to be a maxim of parsimony: ‘Do not multiply your words beyond necessity’. This maxim would be incomprehensible if the set of words were generated; there is no such maxim in syntax, since a ‘new’ phrase is far less of an innovation than a new word.
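Blocking can be pictured as a lookup that precedes coinage. The following minimal Python sketch assumes a toy lexicon keyed by category and an ad hoc meaning label; the names LEXICON and agent_noun, the meaning labels, and the spelling adjustment for final -e are all inventions of this illustration rather than part of the theory.

```python
from typing import Dict, Tuple

# Toy lexicon: (category, meaning label) -> established word form.
LEXICON: Dict[Tuple[str, str], str] = {
    ("V", "WRITE"): "write",
    ("V", "STEAL"): "steal",
    ("V", "SPEAK"): "speak",
    ("N", "AGENT(STEAL)"): "thief",
    ("N", "AGENT(SPEAK)"): "speaker",
}


def agent_noun(verb_meaning: str) -> Tuple[str, str]:
    """Apply a toy version of rule (2), checking the lexicon first.

    Returns (form, status), where status is "listed" (the regular form is
    already established), "blocked" (another established word names the
    concept) or "coined" (a new formation).
    """
    base = LEXICON[("V", verb_meaning)]
    # Orthographic adjustment only: write + er is spelled writer.
    candidate = (base[:-1] if base.endswith("e") else base) + "er"
    listed = LEXICON.get(("N", "AGENT(" + verb_meaning + ")"))
    if listed is None:
        return candidate, "coined"
    if listed == candidate:
        return listed, "listed"
    return listed, "blocked"
```

With this lexicon, agent_noun("STEAL") returns thief with the status "blocked", so stealer is never coined, while agent_noun("WRITE") coins writer. A purely generative component would have no reason to consult the established vocabulary in this way, which is the point of the parsimony maxim.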
The Irrelevance of Structure

The autonomy of words, i.e., their independence of their constituents, is also shown by words which are ill-formed and nevertheless acceptable. In German, the suffix -frei is denominal (nikotinfrei ‘nicotine-free’), but there are some two or three deverbal forms such as bügelfrei (‘ironing-free’, ‘non-iron’) and knitterfrei (‘crease-free’, ‘non-creasing’, of shirts). These words should at least be odd, but they are not. Other formations like Lächler ‘smiler’ are peculiar although they are perfectly well-formed and attested. Word-formation rules are not as productive as syntactic rules, and violations are more acceptable. Morphological structure, if it is relevant at all, is far less important than syntactic structure. The reason is that new words are normally derived directly from established words, that is, in one step only. A two-step derivation seems to be considerably harder to parse than a sentence made of three words. It has been shown that a word-formation rule may be sensitive to the morphological structure of the basis (un- may not attach to bases derived with the affix dis- immediately before); it has also been shown that only the immediately preceding derivational affix can be relevant, but not the structure of the basis further inside.⁴ In word syntax, the relevance of the internal structure has to be ruled out by a rather awkward stipulation such as Siegel’s (1977) ‘adjacency condition’; it is awkward because morphology is claimed to be the theory of the structure of words. The immediate structure of a word can be relevant for further word-formation and for the use of the word, but does one really have to know that the word pineapple juice is made from pineapple and juice in order to use the word? If the answer is yes, does one have to know what a pine is?
The structure of words, that is, the analysis of a complex word down to its root, is a rather academic matter and not pertinent to language use.⁵ Therefore, the task of morphology has to be defined differently: morphology does not describe the structure of words but rather the procedures speakers use to extend their lexicon; the morphological structures of actual words are relevant only insofar as they serve as models for new formations. New words are formed in the same way as (or in analogy to) those models. Morphological rules of the type (2) above are designed to accomplish exactly that task. The input structure is an analysis of the basis, possibly a morphological structure, and it ignores the structure of the basis further inside (which can, but need not, be analyzed by further rules); its irrelevance need not be stipulated. Other aspects, like the ‘headedness’ of words, get lost. A closer look, however, reveals that this is of no importance. The putative category N of the affix -er in the structure (3) does not have any empirical foundation, because affixes by definition do not occur as free forms (unlike write, which can be proved to be a verb). It is arbitrarily called N in order to make it look like a head.⁶
Non-concatenative Compounding and Affixation

Non-concatenative rules like vowel change or stress change are often used to show that morphology is not syntax; these arguments will not be repeated here. It can be shown that even compounding and affixation are non-concatenative in some cases. The most important function that a word structure like (3) fulfils is to represent the ‘motivation’ of the word writer, which is not an arbitrary sign like tree, but motivated by the verb to write, which is taken to be a constituent of writer. In word syntax it is implicitly understood that all complex words are motivated by their constituents, which is false. There are not only numerous back-formations which cannot be given a word structure, there are also many derived words which only appear to be derived from their constituents; frogwoman or firewoman are not formed on the basis of frog, fire and woman, they are motivated by the corresponding compounds of -man. Compounds like airmanship, airscape, air-sick, airman and many others are formed after compounds of sea-. These examples are rather marginal, of course. In German, however, there are some fairly productive derivational rules which turn out to be replacive on closer examination. Complex verbs can be nominalized by substituting a suppletive nominalization for the verbal stem: herausgeben ‘to hand over’ is nominalized as Herausgabe ‘handing over’; there is no synchronic rule to relate geben and Gabe. There are several dozen suppletive pairs like this; it is conspicuous that the meanings of many complex nominalizations are closer to the corresponding complex verbs than to the lexicalized simplex nouns they appear to be derived from. Words can also be formed by the substitution of an affix; quite a number of complex verbs in German have a meaning that actually contradicts the meaning of the stem: the verb ausspannen ‘to unyoke (of oxen)’ or ‘to take out (sheet of paper from a typewriter)’ literally means ‘to tighten out’; this word can only be understood as a formation on the basis of einspannen ‘to tighten in’, ‘to yoke (oxen)’, ‘to insert (sheet of paper)’.⁷ A syntactic analysis of these words is impossible; a paradigmatic analysis is straightforward:

(5)   einXen   V    ‘to v’
      ↕
      ausXen   V    ‘to un-v’, ‘to reverse v’
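A replacive rule such as (5) is easy to state as a mapping from one whole word to another, with no reference to the meaning of the bare stem. The Python sketch below is one hypothetical rendering: the function name rule_5 and the meaning notation REVERSE(...) are assumptions made for this illustration, and the forms are ordinary German spellings.

```python
import re
from typing import Optional, Tuple


def rule_5(verb: str, meaning: str) -> Optional[Tuple[str, str]]:
    """Map einXen 'to v' to ausXen 'to reverse v'; return None if the input does not fit."""
    m = re.fullmatch(r"ein(.+)en", verb)
    if m is None:
        return None
    # The prefix is substituted; the stem X is carried over unanalyzed.
    return "aus" + m.group(1) + "en", "REVERSE(" + meaning + ")"
```

Here rule_5("einspannen", "YOKE") gives ("ausspannen", "REVERSE(YOKE)"), whereas rule_5("spannen", "TIGHTEN") gives None: the output is built on the complex verb einspannen and its meaning, never on spannen ‘to tighten’, which is why the literal reading ‘to tighten out’ plays no role.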
Word-formation rules like (2) and (5) can also strip the base of parts of its meaning. That is necessary in order to describe cases of paradigmatically determined allomorphy such as the formation of French adverbs on the basis of the feminine adjective (cf. Booij 1997: 44f.); the adverb-forming suffix used to be a feminine noun:

(6)   masc. adj.           fem. adj.    adverb
      faux ‘false’         fausse       faussement
      lent ‘slow’          lente        lentement
      beau ‘beautiful’     belle        bellement
      fou ‘stupid’         folle        follement
The suppletive cases clearly show that this allomorphy is paradigmatically determined and cannot be dealt with in any kind of ‘phonological’ component. In Italian, one type of action noun is not derived from the unmarked stem but by conversion from the feminine past participle (cf. Rainer 2001):
(7)   infinitive               fem. past part.    action noun
      camminare ‘to walk’      camminata          camminata ‘walk’
      godere ‘to enjoy’        goduta             goduta ‘orgasm’
      spremere ‘to press’      spremuta           spremuta ‘pressing’, ‘juice’
The Italian verb-noun compounds of the type portachiavi ‘key ring’ (lit. ‘carry-key’) are formally derived from the imperative and can be derived neither from the infinitive nor from the third person singular (cf. Rainer 2001):

(8)   infinitive       3rd p. sg.     imperative     compound
      portare          porta          porta          portachiavi
      tendere          tende          tendi          tendicinghia
      coprire          copre          copri          copricapo
      pulire           pulisce        pulisci        puliscipiedi
      intrattenere     intrattiene    intrattieni    intrattienigente
In word syntax, which only allows additive morphology, these compounds are a problem (cf. Vogel 1993: 237):

    Although claiming that this form (imperative, TB) is the base of the compounds allows us to generalize across the three conjugation classes (including subclasses, TB), there is nothing in the compounds that motivates this analysis since the compounds do not exhibit any syntactic and/or semantic properties of the imperative.

In paradigmatic morphology, the bases of word-formation are inflected word forms whose special inflectional properties are stripped off in the derivation. The base can show an allomorph whose formal properties are carried over to the derivative without the semantic properties attached to the form of the base (note that the noun can be either singular or plural):⁸

(9)
      X     V    ‘v!’
      Y     N    ‘P’
      ↕
      XY    N    ‘one who v’s P’
From the complex semantics of the first input structure (‘v!’) only the lexical part (‘v’) is carried over to the output, whereas the imperative part (‘!’) is stripped off. Another rule of this type can be found in German compounding. Preposed genitives became ungrammatical (with some systematic exceptions) and were reanalyzed as left compound constituents:⁹

(10)  Kalbs Fuß       >   Kalbsfuß       ‘calf’s foot’
      Rinder Stall    >   Rinderstall    ‘cows’ shed’
The suffixes for genitive singular and for plural (the genitive is unmarked in the plural) were reanalyzed as ‘linking elements’ and lost their semantics. By analogy, the former genitive -s was extended to words that do not use -s as a genitive suffix (e.g., feminine nouns), and the former plural suffixes are used in compounds even if their semantics contradict the meaning of the whole word, as in Rinderzunge ‘ox tongue’, lit. ‘oxen’s tongue’. Words with irregular plurals show that it is actually the plural form which is used as left constituent (Mastodontenzahn, ‘mastodons (pl.!) tooth’, is the only word with the plural -ten).
Productivity and the Role of Models

Morphological rules are not transmitted from one generation to the other; the speakers have to abstract their own morphology from the utterances they have heard. A rule like write → writer is based on, and abstracted from, model pairs of words which are shaped according to the same pattern. In non-paradigmatic morphology, the dependence of a formation on such models has been conceded only in odd cases like back-formations, which are attributed to analogy by Bauer (1983: 64) and Spencer (1991: 461), among others. However, there is no reason to assume that a formation should be based on models only if it is not affixational and that the large number of models for a regular affixational formation should be irrelevant. The difference between rules and so-called ‘analogies’ is not a formal one, as the proportional formula or the rule type (2) covers both kinds of formations. Their difference consists entirely in a difference of productivity. The so-called ‘analogies’ are rules of low productivity, and rules are productive analogies. Since hardly any word-formation rule has the unrestricted productivity of a rule of syntax, the difference is a gradual one.
Therefore, a paradigmatic theory has to include a theory of productivity, which has been developed to some extent, although not yet satisfactorily. Some important factors that influence the productivity of a rule were mentioned by Paul (1880: 96f.): semantic and formal transparency, and type or token frequency. Another important factor is ‘morphological naturalness’, which is a rather heterogeneous concept unless it is regarded in the context of productivity. A rule which forms plurals out of singulars is more productive than the inverse rule (‘system-independent’ naturalness), a rule that relates words of different inflectional classes in the same way¹⁰ is more productive than one that applies only to one inflectional class (‘system-dependent’ naturalness), and iconic rules are more productive than non-iconic ones, e.g., affixation is more productive than subtraction (cf. Dressler et al. 1987). Kiparsky’s (1974: 259) example, used in his argument against the proportional formula, can be explained in more than one way:

    For example, we do not expect a new lexical item *heye meaning ‘to see’ to arise from the proportion ear : hear = eye : x, though this proportion cannot without the grammar be distinguished from that which generated an actual analogical form such as brothers, e.g., sister : sisters = brother : x.

Apart from the frequency factor (a rule supported by one model is less productive than a rule supported by many models), the unsyllabic prefix h- is not supported by the English system and is therefore hard for speakers to detect, which can be explained by the theory of ‘system-dependent’ naturalness. Analogies based on a single model do occur, e.g., Greek Zenós ‘Zeus, genitive’ : Zeús ‘Zeus, nominative’ after menós : meús ‘month’ (Hermann 1931: 67). Another example is Sturtevant’s (1947: 97) son’s ear : irrigate = nose : nosigate, which, despite its charm, was not accepted into general usage.
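The frequency factor can be illustrated by counting the model pairs that support a formal pattern in a word list. The Python sketch below is deliberately crude and rests on assumptions of its own: the function name models and the tiny LEXICON are hypothetical, the match is purely formal (meaning is ignored), and type frequency is only one of the factors named above, so the count is at best a rough proxy for productivity.

```python
from typing import List, Set, Tuple

# A tiny, hypothetical word list; a real study would use a proper lexicon.
LEXICON: Set[str] = {
    "ear", "hear", "eye", "nose",
    "sister", "sisters", "brother",
    "book", "books", "tree", "trees",
}


def models(lexicon: Set[str], prefix: str = "", suffix: str = "") -> List[Tuple[str, str]]:
    """List the pairs (X, prefix + X + suffix) in which both members are listed words."""
    pairs = []
    for base in sorted(lexicon):
        derived = prefix + base + suffix
        if derived != base and derived in lexicon:
            pairs.append((base, derived))
    return pairs


plural_models = models(LEXICON, suffix="s")  # supports brother : brothers
h_models = models(LEXICON, prefix="h")       # supports eye : *heye
```

In this word list the pattern X : Xs is backed by three model pairs, while X : hX is backed by ear : hear alone. A single model does not make a formation impossible, as the Greek example shows, but it gives the pattern very little support.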
The most general factor (which is influenced by all the others) is the conventionality of the speakers. Each speaker tries to conform to the other speakers of the language community and prefers the rules the others prefer.¹¹ An adult speaker of German can learn that the s-plural is used for neologisms even though other affixes (such as -e and -en) are more frequent. For children, on the other hand, it is impossible to know that the many words of the basic vocabulary with the unproductive plural -er are not neologisms and, therefore, they use this rule at times (cf. Clark and Hecht 1982: 6).
Paradigmatic Morphology ! 279
In acquiring a second language, it takes a long time before one develops intuitions about neologisms similar to those of the native speakers of that language, and often even native speakers have different opinions on the acceptability of new formations because intuitions about word-formation are dependent on the actual usage in a community and keep developing until old age. The productivity of a rule depends on the acceptability of its output, which, on the other hand, is dependent on various factors not pertaining to the language system. A formation is more acceptable when it is primed by similar formations or by the basis of the formation. The word ?topwards might sound rather odd; less so, perhaps, when it is used immediately after top, side, and sidewards; the third occurrence of that word is likely to be more acceptable than the first. Productivity is a ‘performance’ phenomenon; therefore a competence model of generative morphology is not able to cover the formations actually performed by the speakers of a language and their intuitive judgements on the acceptability of new words.
Other Characteristics of ‘Seamless Morphology’

As mentioned before, the most important characteristic of ‘Seamless Morphology’ presented by Ford et al. (1997) and by Singh and Dasgupta (1999) is its paradigmatic nature. Another aspect, the uniform treatment of inflection and word-formation, was subsumed into the discussion of paradigmaticity. If morphology can be paradigmatic (based on analogy), then inflectional morphology must be paradigmatic, as analogical formations within inflectional paradigms are so frequent and obvious that they must be either acknowledged or ruled out by some very basic principles. Another characteristic of ‘Seamless Morphology’, the rejection of inflectional classes called paradigms (Ford et al. 1997: 44), is independent of the paradigmatic nature of morphology. In the theory outlined above, the question is to be phrased like this: Do morphological rules (two-place relations) like (2) ‘cluster’ to form n-place relations for inflectional classes with n members? There is evidence that speakers normally do not mix inflectional classes by deriving the genitive singular by the rule of one inflectional class and the nominative plural by the rule of another. For example, speakers of German do not form a ‘weak’ genitive des Menschen and a ‘strong’
plural die Mensche from the nominative der Mensch, which can belong to both declensions. This stability of inflectional classes cannot be guaranteed by two-place relations. On the other hand, at times paradigms do split and mix, but the conditions of such diachronic developments have not yet been investigated thoroughly. Therefore, the answer to this question has to be put off. The arguments against inflectional classes brought forward by Ford et al. (1997: 48) do not apply to the theory presented here. First, a theory of inflection based on inflectional classes misses important generalizations, for example, the identity of nominative and accusative of neuter nouns in Latin. As pointed out above, identities between rules of different inflectional classes add to each other’s productivity and thus stabilize each other. This seems to be the only relevant aspect of such identities; note that not every valid generalization is a morphological regularity; it must be relevant to the production of words. The other arguments brought forward by Ford et al. (1997: 48ff.) are based on the fact that within inflectional paradigms any form can be derived from any other form and asymmetries of markedness (normally the plurals are derived from the singulars) are mere tendencies. This argument only applies to a syntagmatic ‘Word and Paradigm’ model of morphology (e.g., Matthews 1972), which describes inflectional forms as derived from underlying stems or particular basic forms; the model presented here covers both the possibility and the markedness of back-formations. A further characteristic of ‘Seamless Morphology’ is the bidirectionality of rules. In the theory presented here, the directions can be distinguished by productivity, back-formations normally being less productive. Equal productivity of both directions occurs within inflectional paradigms and with conversion rules and is rare otherwise. However, another asymmetry is more important.
Some morphological rules are not injective, that is, a rule can merge different inputs into one output. For example, in German rules with umlaut merge the vowel a and ‘underlying’ e into e; the Middle High German genitive ente ‘ant’ could be the genitive of ent, ant and, furthermore, ente. Such a rule has more than one inverse rule which can differ in productivity. Normally, the formally simplest rule prevails; the Middle High German ant was thus replaced by New High German Ente. The same happened to the New High German words Blüte, Drüse, Hüfte, Säule and others (MHG stat ‘place’, ‘town’ has split into NHG Stadt ‘town, city’ and Stätte ‘place’). As the two
directions of rules can differ in productivity and uniqueness (functionality), they should be distinguished and not be merged into one rule. Last but not least, German also provides evidence against seamlessness (Singh and Dasgupta 1999: 319). Morphologically complex words differ in various respects from simplex words. First, consonant clusters which are prohibited within morphs are allowed at the juncture between stems and affixes. For example, Papst+tum ‘papacy’, ‘pope+dom’ shows a geminate at the juncture, which is impossible elsewhere. Second, simplex words can only be stressed on one of the last three syllables; endocentric compounds are stressed on the first constituent, so a compound can be stressed on any syllable counted from the end. On the other hand, the weaker form of the claim of seamlessness, the statement that words are sequences of morphs not having a hierarchical structure, does not seem to be challenged by German morphology.¹² Since the ‘other characteristics’ of ‘Seamless Morphology’, including its name, are rather marginal compared to its paradigmatic character, it proves to be highly welcome progress in theoretical morphology.
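The observation made in this section that umlaut rules are not injective, and therefore have more than one inverse, can be sketched as follows. This is a toy model under loud assumptions: forms are plain spelling strings, the forward rule is reduced to "replace a by e and add -e", and the names `umlaut_genitive` and `inverse_images` are hypothetical. The additional candidate ente mentioned in the text would come from a separate, endingless rule and is left out of the sketch.

```python
from itertools import product
from typing import List


def umlaut_genitive(stem: str) -> str:
    """Toy forward rule: umlaut a -> e, then add -e (ant -> ente, ent -> ente)."""
    return stem.replace("a", "e") + "e"


def inverse_images(form: str) -> List[str]:
    """All stems that the toy forward rule maps onto the given form."""
    if not form.endswith("e"):
        return []
    body = form[:-1]
    # Every e in the body may go back to an original a or an original e.
    options = [("a", "e") if ch == "e" else (ch,) for ch in body]
    return sorted("".join(chars) for chars in product(*options))


if __name__ == "__main__":
    candidates = inverse_images("ente")
    print(candidates)  # ['ant', 'ent']
    # Each candidate really is mapped onto the same output by the forward rule.
    print(all(umlaut_genitive(stem) == "ente" for stem in candidates))  # True
```

Because one output has several preimages, the backward direction is a one-to-many relation while the forward direction is a function, which is the asymmetry appealed to above when the two directions are kept apart as distinct rules.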
Notes

1. Cf. Aronoff (1976), Scalise (1984) for an introduction. The term ‘Word-Formation Rule’ is a misnomer as the paradigmatic character of inflectional morphology is even more obvious.
2. These rather sketchy representations ignore details which are unimportant for the present argument.
3. Paul (1880) did not invent this concept of morphology; he merely gave the first and more or less the last thorough treatment of this approach, which dates back to the grammarians of Western antiquity (cf. Best 1973).
4. Scalise (1984: 169ff.); cf. Siegel (1977) and Williams (1981) for a discussion.
5. The peculiar architecture of Lexical Phonology (Kiparsky 1982), which sends the words cyclically from the morphological to the phonological component and back, is due to the unnecessary analysis of words down to the root.
6. Cf. Becker (1990/91) for a discussion of the notion of ‘head’ in morphology.
7. See Becker (1993a, b) for a more detailed presentation of replacive formations in German.
8. Pace Ford et al. (1997), compounds can be two-place functions taking two words as their basis; see below.
9. Cf. Becker (1992) for details.
10. E.g., in Latin and other Indo-European languages, nouns of neuter gender have identical nominatives and accusatives, irrespective of number or inflectional class.
11. Conventionality has a sound biological basis with social animals like human beings.
12. Cf. Becker (2000) for further discussion of compounding.
References

Aronoff, Mark. 1976. Word-formation in Generative Grammar. Cambridge, Mass.: MIT Press.
Bauer, Laurie. 1983. English Word-formation. Cambridge: Cambridge University Press.
Becker, Thomas. 1990. Analogie und Morphologische Theorie. Munich: Fink.
Becker, Thomas. 1990/91. ‘Do words have heads?’ Acta Linguistica Hungarica. 40: 5–17.
Becker, Thomas. 1992. ‘German compounding’. Rivista di Linguistica. 4: 5–36.
Becker, Thomas. 1993a. ‘Morphologische Ersetzungsbildungen im Deutschen’. Zeitschrift für Sprachwissenschaft. 12: 185–217.
Becker, Thomas. 1993b. ‘Back-formation, cross-formation, and “bracketing paradoxes” in paradigmatic morphology’. Yearbook of Morphology 1993, pp. 1–25. Dordrecht: Kluwer.
Becker, Thomas. 2000. ‘On the non-hierarchical structure of compounds: A reply to Singh and Dasgupta’. In The Yearbook of South Asian Languages and Linguistics 2000, ed. by Rajendra Singh, pp. 283–92. New Delhi: Sage Publications.
Best, Karl-Heinz. 1973. Probleme der Analogieforschung. Munich: Hueber.
Booij, Geert E. 1997. ‘Allomorphy and the autonomy of morphology’. Folia Linguistica. 31: 25–56.
Clark, Eve V. and Barbara F. Hecht. 1982. ‘Learning to coin agent and instrument nouns’. Cognition. 12: 1–24.
de Saussure, Ferdinand. 1916. Cours de linguistique générale. Ed. C. Bally, A. Sechehaye with contribution of Albert Riedlinger. Paris, Lausanne. Critical edition by Tullio de Mauro 1972. Paris: Payot.
Dressler, Wolfgang U. (ed.). 1987. Leitmotivs in Natural Morphology. Amsterdam: Benjamins.
Ford, Alan and Rajendra Singh. 1997. Pace Pāṇini. Towards a Word-based Theory of Morphology. New York: Lang.
Hermann, Eduard. 1931. Lautgesetz und Analogie. Berlin: Weidmannsche Buchhandlung.
Kiparsky, Paul. 1974. ‘Remarks on analogical change’. In Historical Linguistics II. Theory and Description in Phonology, ed. by John M. Anderson and Charles Jones, pp. 257–75. Amsterdam: North Holland.
Kiparsky, Paul. 1982. ‘From cyclic phonology to lexical phonology’. In The Structure of Phonological Representations, Part I, ed. by Harry van der Hulst and Norval Smith, pp. 131–75. Dordrecht: Foris.
Matthews, Peter Hugo. 1972. Inflectional Morphology: A Theoretical Study Based on Aspects of Latin Verb Conjugation. Cambridge: Cambridge University Press.
Paul, Hermann. 1880. Principles of the History of Language. Translated from the second edition of the original (Principien der Sprachgeschichte, Halle 1886) by Herbert Augustus Strong. New and revised edition 1890. London: Longmans. Reprint 1970: College Park, MD: McGrath.
Rainer, Franz. Forthcoming. ‘Compositionality and paradigmatically determined allomorphy in Italian word-formation’. In Festschrift for Wolfgang U. Dressler, ed. by Schaner-Wolles, John R. Rennison and Friedrich Neubarth. Torino: Rosenberg and Sellier.
Scalise, Sergio. 1984. Generative Morphology. Dordrecht: Foris.
Siegel, Dorothy. 1977. ‘The adjacency condition and the theory of morphology’. Proceedings of the Eighth Annual Meeting of the North East Linguistic Society, pp. 189–97. Amherst, Mass.
Singh, Rajendra and Probal Dasgupta. 1999. ‘On so-called compounds’. The Yearbook of South Asian Languages and Linguistics 1999, ed. by Rajendra Singh, pp. 318–32. New Delhi: Sage Publications. (Included in this volume as Chapter 4).
Spencer, Andrew. 1991. Morphological Theory. An Introduction to Word Structure in Generative Grammar. Oxford: Blackwell.
Sturtevant, Edgar H. 1947. An Introduction to Linguistic Science. New Haven: Yale University Press.
Vogel, Irene. 1993. ‘Verbs in Italian Morphology’. Yearbook of Morphology 1993, pp. 219–54. Dordrecht: Kluwer.
Williams, Edwin. 1981. ‘On the notions “lexically related” and “head of a word”’. Linguistic Inquiry. 12: 245–74.
284 ! Probal Dasgupta
11 The Importance of Being Ernist¹

Probal Dasgupta

Ode to a Grecian Ern

Once upon a time, there was an etymological enterprise one believed in. It had originated in Greece and taken modern shape through the Romantics. We who sing of its passing today must allude to the Romantic Keats and his Grecian urn as we begin to consider the putative suffix ern which the prevalent paradigms of synchronic morphology would have us believe derives the adjectives in (1b) from the nouns in (1a):

(1) a. i. north  ii. south  iii. east  iv. west
    b. i. northern  ii. southern  iii. eastern  iv. western
It is useful to coin the designation Ernism for an approach to the study of language in general, and morphology in particular, which focuses on the nature of the English element ern and seeks alternatives to a naïve assent to the prevalent view that such a sequence is simply a morpheme. Why should anybody wish to withhold naïve assent to that view? Because, arguably, there is a Greenbergian morphemic square not only at (1ai, 1aii, 1bi, 1bii) and (1aiii, 1aiv, 1biii, 1biv), but also at:

(2) i. northern  ii. southern  iii. northerly  iv. southerly
Omitting the corresponding square for eastern, easterly, etc., we may take the above facts as a complete microcosm, sufficiently illustrating the crucial problems in contemporary thinking about word structure. Linguists who naïvely accept the standard synchronic morphologies will have to interpret the data set in (2) either by treating erly as a single element or by segmenting it into an er and a ly, both of them morphemes of some sort. Those of us who, as Ernists, are looking for an alternative, are not going to be easy to convince that either the erly or the er-plus-ly analysis achieves any real parsimony. We consider first the erly school of thought. This view claims that the suffix erly turns (1a) into a set of archiadjuncts usable in adverbial as well as adjectival positions. Some variants of this type of thinking add that this suffices to differentiate erly from the adjective-deriving ern, rendering any attempt at semantic specification redundant. Others, more given to semantic adventure, might on the contrary seek in the direction-emphasizing content of erly a reason for its unusual archiadjunct properties. Then there is the option of distinguishing the morphemic segments er and ly. Proponents of this view may note that ly independently exhibits both adjectival (friendly-type) and adverbial (badly-type) behaviour. They may argue that ly, when it attaches to ern, does two idiosyncratic things: it deletes the n, in the phonology, and it conveys a direction-emphasizing sense, in the semantics. These stipulations may seem a small price to pay in order to avoid postulating the otherwise unnecessary morpheme erly. How do the Erlies and Erpluslies fare when they try their arguments on us Ernists who also belong to their target audience? Do our professions of scepticism about morpheme thinking prevent us from taking sides in a debate like this?
Indeed, the Ernist writing this would urge both sides to hold their horses and tarry over the further problem of the erner words:

(3) i. northerner  ii. southerner  iii. easterner  iv. westerner
Assume, for the moment, that a certain er derives the nouns in (3) from the adjectives in (1b). Does this set pattern with other er words derived from region nouns as in (4)?
(4) i. highlander  ii. islander  iii. New Zealander  iv. Icelander
Does (3) count as patterning instead with other er words derived from relative location adjectives as in (5)?

(5) i. insider  ii. outsider  iii. foreigner  iv. stranger
What gives us a clear warrant for distinguishing such examples from less well defined cases like (6)?

(6) i. forty-niner  ii. women’s-libber  iii. old-timer

These questions, and some of the considerations they imply, seem to me to weaken the Erplusly’s reasoning seeking our support for the apparent regularity of the derivation of northerly from northern. That reasoning depended on the idea that such a derivation reduces four stipulations to one: a single idiosyncrasy statement, made for a particular kind of ly, coupling the phonology of n-deletion (however stated, allowing for doctrinal variation) with the semantics of emphatic directness which possibly motivates a syntax that suspends the forces distinguishing adjectives from adverbs. But we now begin to wonder if the enterprise of reducing the number of idiosyncrasy statements helps us to gain any insight into lexical matters in general. Once we doubt the value of the enterprise, the Erplusly has no cogent argument to offer. To see this point clearly, let me unpack the doubt in some detail. In set (4), there is not enough pattern. Thanks to the Scots, the term highlander is reasonably familiar to most speakers of English; possible words like midlander and lowlander are less obligatory as items of lexical knowledge. Turning to islander, this is an isolate, there being no caper, continenter, reefer, deserter, etc., to match it. One does have words like mainlander, drylander and others formed from land, but no marsher, swamper, hiller, valleyer, atoller, duner.
Does set (4) perhaps illustrate a land-based pattern, then? Can we afford to treat island as isle-land and Zea-land as containing the nonversatile (cranberry) morpheme Zea, so that er can be said to regularly attach to the land morpheme? The key word here is ‘regularly’. What can you do with a language which pronounces Englander as Englishman, Hollander as Dutchman, and Scotlander as Scotsman? Or with the fact that other languages equally stubbornly refuse to exhibit lexical regularity of the sort that this logic seeks? This was the unpacking of our Ernist pessimism about finding any neat derivational logic in the ways of words. At an immediate level, such doubt might seem to lead us Ernists to side with the Erly against the Erplusly, for it is the Erplusly who tries to base the whole story on the parsimony metric. But the Erly is also committed to the reductionist logic which the consistently disorderly, order-resisting data lead us to doubt. The Erly would have us say once, for erly, what we would otherwise have to say four times, for northerly, southerly, easterly, westerly, assuming that we should think of each lexical item as a set of statements of idiosyncratic properties, statements which any descriptively adequate account should try to pack into compact generalizations in order to make further, explanatory research possible. Now, it is these collateral assumptions that Ernism, at the conceptual level, compels us to doubt. In the case of the ern examples in (1b) which give Ernism its name, this doubt takes the simple form of an empirical question: what does the dominant non-Ernist, morpheme-seeking approach gain by postulating an ern affix? The dominant approach, relying on the usual collateral assumptions, is of course compelled to react by patiently rehearsing the story of the morpheme concept: «NSEW, don’t you see, are the cardinal geographical directions.
All and only these lexical items share a feature or features uniquely characterizing this set; for convenience, we choose to call it the CGD feature, for Cardinal Geographical Direction (CGD). Now ern, don’t you see, has a singular property. It attaches to CGD nouns and derives adjectives from them. What we gain by postulating ern is a simple matter of the arithmetic of descriptive stipulations in the lexicon. Instead of saying four times that the adjective corresponding to north happens to be northern rather than northly, northish or northic, the adjective based on east comes out
as eastern rather than eastly, eastish, and so forth, the postulation of the morpheme ern enables us to say, with modest but real saving of descriptive arbitrariness, that ern and only ern adjectivizes CGD nouns.» This answer, which I believe truly represents the prevalent morphological mind-set and does not just set up the straw man I plan to knock down, ceases to compel if one has been looking at the data adduced in the discussion. Looking at the supposed affix er we can please ourselves by identifying some sort of affinity between it and a certain landed family of bases it attaches to. But it is not at the er point of lexical information (assuming that the English lexicon has a specifiable er entity) that we can ask and be told whether all and only landed bases (within a certain range of candidates) go in for er. There is no way for er to know, so to speak, that it is not allowed to get away with Thailander or Hollander; it would look as though those choices are pre-empted at the bases Thailand and Holland, which associate themselves with Thai and Dutchman instead. This would mean that the putative marriage between, say, Iceland and the parsimonious entity er takes place subject to literally mutual stipulation, with er saying it can live with a land (but not, say, with a stan like Afghanistan or Pakistan) and Iceland choosing er rather than some other option. At this point in the reasoning, one fails to see what is parsimonious about such arrangements. Thus, returning to ern, we now see that you are saying something four times anyway. You have decided to avoid saying, of each CGD noun, that it stipulatively chooses ern. But you do have to set up something like an implicational statement ‘if an N is CGD then it chooses ern’ to ensure that all CGD nouns choose it.
Note that this statement cannot be a property of ern any more than attaching to some but not all landed bases could, in the case of er above, be a property of er; for the suffix can only know and state its own options and compulsions, not those of the other party. Now, the implicational statement from the nouns to the ern can work only if each of these nouns, and only these, carries a lexical, idiosyncratic feature «CGD». This amounts to saying something four times. Such reasoning enables the Ernist to see that considerations of parsimony do not force us into the arms of an ern affix as an independent cluster of stipulative lexical information, standardly called an ern morpheme. We begin to ask what story to fall back on. What, we ask, is the null hypothesis compared to which the
morpheme account claims (falsely, we begin to see) to have come up with a more parsimonious packaging of the material? The provisional answer that we will cleave to at this stage is Lexical Integrity. There are simply words. Some of them form morphologically interesting paradigms with other words. These formations need not be seen as compromising the integrity of each word. Certain phenomena in such paradigms that the morpheme-based approach pays special attention to will also seem interesting and worth describing under the Ernist’s Lexical Integrity assumption. But naïve morpheme postulations will not strike the Ernist as an appropriate way to handle such phenomena. Readers who continue to feel secure in morphemic assumptions often have trouble grasping the argument in the format we have used so far. In order to help such readers to see the point of the Ernist enterprise, we now move to the topic of the unusual segmentations shown in (7) and (8):

(7) PCGDs (Polar CGDs): i. nor-th  ii. sou-th
(8) LCGDs (Lateral CGDs): i. ea-st  ii. we-st

The Ernist sees the segmentation of northern into north and ern plus the machinery of the CGD feature mediating their marriage as essentially similar to those shown in (7) and (8) which most linguists would reject. Assume that CGD, a feature or feature cluster set up on semantic grounds, bifurcates further (on semantic grounds equally compelling) into the narrower features (by feature we mean feature or feature cluster) PCGD and LCGD, yielding a division of the lexical family NSEW into the Polar moiety north south and the Lateral moiety east west. Notice that now morphemic thought is nearly bound, by its logic, to flow into the unattractive channels shown in (7) and (8).
If morphemic thinking must treat the family in (1b) as displaying the family totem ern, and must hold up such a totem for separate inspection and worship as a matter of theoretical necessity, then no counter-principle can stop such thinking from behaving likewise vis-à-vis the Polar subfamily in (7), with the subfamily totem th marking its PCGD status, and the Lateral subfamily in (8), with the subfamily totem st signifying its LCGD nature. Though such
segmentations are a bit of a reductio ad absurdum, the Ernist eye sees morpheme-seeking theories as unable to censure those who would seriously propose them. An anti-Ernist will, no doubt, urge the Ernist to remember the logic of the Greenbergian morphemic square, which might seem to some readers to preclude (7) and (8). But the square merely points to prototypical examples of morpheme postulation and cannot be used to legislate against otherwise motivated morpheme cuts. The absence of a morphemic square pattern seems not to prevent us from imagining that we see the appropriate segmentations in:

(9) steal-th
(10) be-neath
(11) with-out

There is clear independent motivation, of a type normally accepted in morphemic theories, for the th morpheme in the Polar Cardinal Geographical Direction subfamily. The motivation comes from the phonological alternation of voiceless th in the nouns with voiced th in the ern adjectives, an alternation not found in

(12) i. steal-th  ii. steal-th-y (also voiceless th)
(13) i. leng-th  ii. leng-th-en (also voiceless th)
and thus requiring comment. If that type of motivation was good enough for setting up a sume morpheme in consume, presume, resume for the sake of a generalization about sume, sumption, sumptive, in the absence of any semantically seaworthy morphemic squares in that neighbourhood, then surely morphemic theory should be pleased with at least (7) on similar grounds, and therefore with (8) as an inevitable residue. In fact, th in (7) is an even better qualified candidate than sume, since th can be associated with a meaning as the subfamily-character-bearer of the PCGD subfamily, whereas sume could not be assigned any interpretation. Lack of morpheme-theoretic credentials, unfortunately for the theory, is not the problem at (7) and (8), then. On the contrary, the credentials, especially for (7), are quite impressive. But the segmentation fails to convince. The Ernist would urge adherents of morpheme-based and other morpheme-oriented theories (including some that
move in the word-based direction) to see the unconvincing (7) and (8) as similar to the apparently more plausible morpheme cut right before ern in (1b).
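The reductio built on (7) and (8) can be illustrated with a short sketch. It assumes, purely for illustration, that a 'family totem' is found by the simplest possible procedure: take a set of words grouped under one semantic feature and extract the longest ending they share in spelling. The grouping in `FAMILIES` follows the CGD, PCGD and LCGD features of the text; the function name `shared_suffix` and the procedure itself are hypothetical and are not attributed to any morphemic theory.

```python
from typing import Dict, List

# Word families grouped by the semantic features used in the text.
FAMILIES: Dict[str, List[str]] = {
    "CGD adjectives": ["northern", "southern", "eastern", "western"],
    "PCGD nouns": ["north", "south"],
    "LCGD nouns": ["east", "west"],
}


def shared_suffix(words: List[str]) -> str:
    """Return the longest string that every word in the list ends with."""
    suffix = words[0]
    for word in words[1:]:
        # Trim the candidate from the left until the current word ends with it.
        while not word.endswith(suffix):
            suffix = suffix[1:]
    return suffix


if __name__ == "__main__":
    for label, words in FAMILIES.items():
        print(label, "->", shared_suffix(words))
    # CGD adjectives -> ern
    # PCGD nouns -> th
    # LCGD nouns -> st
```

The same procedure that isolates ern in (1b) isolates th and st in (7) and (8); nothing in the procedure itself distinguishes the accepted cut from the rejected ones.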
A Metatheory for Ernism

Given a specific formulation, readers are free to associate it with various interpretations. Similarly, the Ernist considerations presented in Section 1 become tighter and better directed when attached to a particular context. The present section offers one possible contextualization. The study of natural language has had a discontinuous history. Each major wave of inquiry has proposed a specific set of prototypically interesting item-source relations and embedded that set in a particular derivational methodology of deriving such items from such sources. Some of these waves have featured even more ambitious proposals to spell out systematic linkages between distinct waves with their different methods of linguistic study; but these proposals have never had any teeth; all workers in the field have reconciled themselves to the impracticality of building such bridges over the gaps between methodologies. One form the discontinuity has taken is the temporal gap between, say, a nineteenth-century paradigm of historical and comparative philology and the synchronic enterprise of the twentieth century. Another, more unsettling form of the discontinuity appears in the schools-of-thought gaps, which became apparent already in the structuralist decades, but grew in scale and severity when structuralism gave way to a field bifurcated into a theory-focused generative microlinguistics and a data-driven set of macrolinguistic enterprises, from psycholinguistics and stylistics to sociolinguistics and language planning studies, with terms of reference often argued to be vastly different from the microlinguistic core of language study. The gaps mentioned above appear at first glance to be strikingly different in extent, type, and tractability.
One assumes, for instance, that there is no intractable gap between the historical legacy of the nineteenth century and contemporary work, for current apparatuses can readily reconstruct the rational kernel of what the nineteenth century scholars had come up with. Contrary to this widespread
impression, it is argued here that all these gaps are most usefully seen as being of the same type (on a certain construal, which this section spells out) and as equally intractable at the level of intermethodical dialogue or reconciliation. It is further suggested that the sooner we accept the impossibility of confronting and bridging these gaps, the better for the actual prospects of dealing with the overall problem of discontinuities in the study of natural language, paradoxical as this may sound. Our reading of the problem runs as follows: Any paradigm P seeking to systematize a certain type of knowledge K about natural language L is going to identify a prototypically interesting set of items I and focus on them. The paradigms that have prevailed in linguistics until recent years have had properties leading to systematic divergence and unbridgeable discontinuities. The key property has been the tendency to propose illuminating descriptions of I which illuminate entirely or mostly in terms of a derivational or quasiderivational mapping D associating I with a set of sources S. Typically, this mapping has featured a set of item mappings of the form d(s) = i: the derivation d associates item i with source s. The first and archetypal mapping D was the etymological D which drove comparative and historical studies. Subsequent D mappings until recently reproduced the general architecture of the etymological D and may be usefully described as synchronic etymological enterprises. Only in recent years have linguists started outgrowing that type of D. This outgrowing process has come far enough to make it worth our while, today, to consider post-etymological metatheories of linguistics, like the one proposed here, and to take stock of what the etymological metatheories have achieved and ask how we can preserve that past without letting that effort destroy our present. We shall do a bit of that stock-taking here as part of the explication of our specific metatheory.
Thus, the following is not offered as an empirical history, but as a classificatory idealization of enterprises which reflects the logic of our metatheory. Our use of the historical mode is an expository necessity in such an exercise. Historical linguistics (to prescind from much nineteenth-century variation and conceptual change) offered an Etymology of Words. That enterprise comes to its crystallization in the work of Saussure. He realizes that the neogrammarian project, however completely implemented, is in principle unable to bring all linguistic data under the scope of law-like regularities, and cannot
intelligibly systematize the irregular residue. Hence his ‘arbitrariness of the linguistic sign’, of course; hence also, and more importantly, his analysis of elements in terms of four dichotomies foregrounding what is Immanent-Present and backgrounding what is Transcendent-Present: the signifier (I, for Immanent) and the signified (T, for Transcendent); the form, I, and the substance, T; the synchronic, I, and the diachronic, T; the syntagmatic, I, and the paradigmatic, T. This framework encourages the development of a connected account of forms in synchronic systems, since such an account can rely entirely on the etymological style; one then ends up leaving the residual material to other disciplines oriented to the Transcendent poles of the dichotomies. In Saussure, the historical Etymology of Words self-destructs and merges into various structuralist projects which all regard diachronic linguistics as an optional other discipline (this was the self-destructive element in Saussure’s moves). On our construal, this self-destruction arises from Saussure’s attempt to push the neogrammarian logic to the point of demanding total accountability. Historical etymology as such must relegate plenty of material to the domain of ‘analogy’, holding synchronic systems responsible for creating new material. Therefore a historical linguistics trying to set up a systematic business is forced to pass the buck to an ‘arbitrary’ and ‘creative’ synchronic system responsible for the elements we find no decent historical etymon for. But no structuralism can offer synchronic sources for words. Since the linguist’s etymological drive remained intact while the historical wing of the enterprise became first optional and then marginal, the derivational impulse sought new objects, especially in the material that Saussure had relegated to parole.
Late structuralism bifurcated into an etymology of sentences in language systems (a generative grammar) and a differentiated set of etymologies of acts of speaking, of language use (a macrolinguistics, spread over sociolinguistics, psycholinguistics, speech act theory, and so forth). Both branches of the new project began by seeking frankly derivational accounts. As this venture failed across the board, post-etymological methods began to appear, changing the terms of the problem. The post-Saussurean problem had been initially posed, by one and all, as one of reducing the domain of composite or ‘free expressions’ (comprising what Saussure called a syntagmatic chain of simple signs) to new types of lawfulness. For instance, syntacticians sought
to show that speakers are in fact not simply free to choose to combine the signs every which way, but must obey combinatoric regularities, arbitrarily distributed over the body of languages in ways that we are only beginning to understand. Sociolinguists argued that, given a field of variability, exhibiting so-called ‘free variation’, no speaker is really given a free choice; the strings attached tie each variant to specific register or subcommunity or style affiliations which end up as part of the meaning of that variant vis-à-vis its rivals. Developmental psycholinguists proposed to derive child phonology from adult phonology, or to derive adult syntactic production from adult semantic intention. Surely the most influential derivational mappings of that period were those in syntax, from deep to surface structure; and in phonology, from phonological to phonetic representation. What one might call the general problem of the ‘compaction of free expressions’ is still with us. But, with the growth and maturation of these early approaches to the problem, we have all learned from the failure of a naïve derivationism that seeks a privileged set of basic free expressions from which all other variants arise or to which they all crucially allude. It is this common lesson we have learned, cutting across subtraditions of post-Saussurean linguistics, that seems to provide a basis for a potentially new context now. Without falsely promising any total reconciliation between, say, sociolinguistics and generative grammar, we can usefully note certain parallels in their development. Early generative grammar assumed it would find a compact set of underlying structures giving rise to a broad spectrum of alternating superficial ones.
Early sociolinguistics hoped to show that a compact set of High or Formal carriers of social prestige would help make sense of a broad spectrum of fluid variability among prestige-seeking members of various social strata, speaking in moments of various degrees of relaxed distance from the pursuit of prestige. Both of the enterprises ended up settling for a much less compact reference set. Generative grammarians accept several types and modes of underlying structure (or equivalents thereof) in syntax and phonology. Sociolinguists also have relativized the high-low or formal-informal dichotomy to diverse domains and various kinds of intercommunity transaction which affect the dynamics within a community; this leaves them without any unique, compact set of High utterances emitted by a unique set of Beautiful People at their most Formal moments controlling all else in the community.
Both of the enterprises, then, accept the impossibility of localizing any formal base postulated as the rigid reference point for fluid and variable surfaces. This ends the initial part of the post-Saussurean effort. What does it begin? Here we go back to the empirical material of Section 1. To the extent that linguists abandon the all-out effort to achieve total accountability of morphic material in terms of etymological enterprises, it becomes possible to refrain from trying to reduce northern, etc., to north, etc., plus a separate ‘linguistic sign’ ern. Here, in Section 2, we are examining the larger context of the withdrawal from that total accountability imperative. Once one realizes that so-called superficial or derivable forms are not necessarily to be interpreted relative to some exclusive and unique reference set at the heart of language, one regards these exercises of setting up reference sets (and mapping relations D(S) = I pointing from the reference set to the items of analytical interest) as optional. They depend on the various I-sets of interest in different types of linguistic analysis. An enterprise that becomes optional shrinks back to the cases where it is of genuine use, the way historical linguistics rolled back to its antiquarian dimensions after synchronic linguistics had opened up the directly askable questions of living languages. Likewise, once the etymological impulse in general is demobilized, one may reasonably expect it to go back to the normal behaviour of civilian life. This means, for words like northern, that one stops being tempted in such cases to cut the word up and try to reduce the paradigmatic facts of families (1a) and (1b) to the imagined syntagmatic properties of some construct ern exercising its selective affinities. One permits the paradigmatic fields to keep their specificity.
Now we begin to answer the question of what begins when the late phase of post-Saussurean work seriously accepts the impossibility of localizing any rigid formal reference set to which all fluid, variable surfaces refer and signals the end of the first idealization. We have made the point, with a specific example in mind, that Saussure’s transcendent presence of the paradigmatic stops being backgrounded relative to the immanent presence, the syntagmatic. This is how the rest of our answer goes. The alternatives to Ernism, such as Erly and Erplusly, implicitly deny the specificity of the signified. They seek to do as much of the work as possible at the level of the signifier, if necessary by slicing
it into bits and zeroes, so that practically nothing needs to be said about the suspicious and mysterious signified. In the case at hand, morpheme-oriented theories would have us attribute the relevant semantic properties to the logical syntax of affix elements, thus reducing the apparent semantic field (1b), say, to the allegedly composite signifier consisting of (1a) plus «ern». It does not bother the morpheme-oriented that such moves falsely extend the domain of speaker freedom from truly free expressions (syntactic combinations of words) to ‘free combinations of morphemes in words’, thus undermining what little sense we have been able to make of the speaker’s combinatorial freedom. We are proposing that this extension be repealed. This means restoring to the signified, another of Saussure’s transcendent presences, the status of a possible object of direct inquiry that can be studied without having to always look for signifier-level mirroring of properties of the signified. And we note, on the basis of the data of Section 1, that the opening up of the paradigmatic gives us appropriate, non-syntagmatic machinery with which to actually investigate the signified rigorously. For a larger case study making the same point, in a metatheoretical context partly similar to what is offered here, see Chapter 6 of Dasgupta (1989). Morpheme-oriented readers who have been lamenting our apparent desire to jettison the legacy of morphological work should be, if not gratified, at least moved away from their naïve lamentation by our next move. We now proceed to salvage another of Saussure’s transcendent presences, this time that of the diachronic, from its backgrounded place outside his synchronic structure. What there is to say about word-internal morphological structure in the quasi-reconstructive mode of standard morphemics should, we propose, be presented directly as diachronic reconstructions, which need not be exiled from the normal study of language.
To the extent that speakers have intuitions about, say, some affixes being productive and others dead, we should register these as facts about the psychological reality of diachronic relations between words. This move puts etymology proper back in its rightful place: the history of language, now seen as a reality (in dialogue with the more «purely» synchronic realities) in the mental representations speakers have of what they know of their language. Questions, of course, will arise in such a context about apparent or ‘folk’ etymologies versus the real ones which convince experts familiar with a larger body of evidence. When such questions are actually raised and discussed (rather than suppressed, as tends to happen at present), it will perhaps emerge that these problems interact fruitfully with wider considerations about popular or folk images of normal social history versus the ‘real’ versions of history rigorously put together by professionals. To illustrate again with respect to our ern words, it is an etymological fact about modern that it belongs to the Romance stratum of the English lexicon and nominalizes as modernity. Equally etymologically and historically, northern, etc., are native items, intolerant of formations like northernity. These facts are, if one wishes, part of the present history available to the native speaker’s mental representations of English, however they may look on the bigger chessboard of the historical linguistics aficionado’s games. Such facts, even if their status as conceptualized here must be drastically modified in an adequate metatheory, are to be seen as interacting with the ways of the paradigmatic to produce some of the more delicate semantic effects. In this sense, the associative relations we are talking about, which are opened up in several distinct ways by the moves made here, make it possible to envisage concrete interactions between the various transcendent terms which Saussure’s dichotomies had exiled from pure linguistics. We turn, finally, to the form-substance dichotomy. The logic of our argument encourages us to plead for a serious revival of interest in substance. Here one might feel like saying that history got there before our argument did. For serious inquiry in phonetics and semantics has been going strong and attracting the interest of pure linguists for quite some time now. Of course we agree with that sentiment.
That goes also for much else proposed here; if we do not spend the time to document in detail the claim that these moves merely register and systematize what many thinkers are doing already, this is only because such documentation would distract attention from the unity of these tendencies that we seek to demonstrate by constructing a visibly single argument here. But there is more to the revival of substance than simply attending to phonetics and semantics again. When Saussure conceptualized form vis-à-vis substance, he did so in a way that began by locating a certain type of form-substance boundary at the interface between language and the neighbouring systems: phonetics at one end and cognition at the other. But it went on to distribute another kind of form-substance boundary over the
whole linguistic system, at every internal interface where entities from one type of game are regrouped and recharacterized for another type. This became clear when Saussure’s promises were cashed in the form of structuralist systems all over the world. Thus, if phones are substance for phonemic form, then phonemes constitute morphs as substance for morphemic form, in the American implementation of structuralism. On such a reading, Saussure’s dictum that language is form, not substance, leads to a privileging of the ‘higher’ level over the ‘lower’ in all inter-level transactions. Reversing the bias of this dictum, while continuing to cherish the positive achievements of the old framework, is going to mean listening to the ‘lower’ levels when they talk back. Thus, it is a feature of ‘semantic substance’, independent of the forms imposed in the linguistic organization of the lexical system, that the way one thinks and talks about CGDs does lead to adjectives pertaining to the composite directions, as in (14), but not to conceivable adjectives which, if they existed, would denote the polar axis and the lateral axis, as in (15):

(14) a. north-western
     b. north-eastern
     c. south-western
     d. south-eastern

(15) a. *north-southern
     b. *east-western
Also semantic-substantive are such facts as the primacy of the polar over the lateral axis, illustrated at (16) (versus [14] given above):

(16) a. *west-northern
     b. *west-southern
     c. *east-northern
     d. *east-southern
Listening to substance means registering the effects of substance on the body of form without immediately trying to find autonomous, form-only-based reasons for these effects. It does not mean a newfound enthusiasm for finding substantive causes for all formal phenomena and thus simply trying to reverse somebody else’s reductionist arrow. It does mean refraining from seeking a total accountability that demands that exactly one kind of accounting should be able to paper over all gaps.
For instance, consider the fact that the northern-type words, as well as upper, nether, inner, outer, fore, hind, top, bottom, left, and right, sponsor yet another interesting word family:

(17) a. northernmost
     b. southernmost
     c. uppermost
     d. nethermost, etc.
The narrative of this family has some form to it. There are no holes in the pattern. Possibly ingenious semanticists will be able to produce a feature bundle uniquely and interestingly specifying all and only these 10 words. But what interests the Ernist is the fact that there is also some substance to family (17). For one thing, there is the quirky member utmost. For another, and this is the type of fact that becomes interesting and reportable only when substance makes a comeback, note that there is no family (18):

(18) a. *northernmore
     b. *southernmore
     c. *leftmore
     d. *rightmore, etc.
But, if form was all, there most certainly should have been an (18). It is a fact of semantic, verging on pragmatic, substance that cardinal directions and their ‘family resemblants’ like left and right denote rays, for which normal pointing and the marking of extremes in (17) both make a kind of practical sense that deserves the social enshrinement conferred by wordhood, whereas the relative distance marking of (18) is incongruous.
Immediate Contextual Specifications

The treatment of the ern facts and the metatheory presented here are not, it should be emphasized again, anything original. They merely put together, in one package and in manifesto format, ideas from the lexicalist tradition in the particular shape it took in the early years of the journal Linguistic Analysis, as the ideas look when they combine with the morphological and other work of Rajendra Singh and Alan Ford, to which I have had access both as a reader and, even more rewardingly, through personal communication. What is
new about this particular manifesto, written in response to Ford and Singh’s work (Ford and Singh 1992), is our attempt to show that Ernist thinking, as we call it here, on the basis of possibly new examples (given in Section 1), brings to a head the natural tendencies in the growth of the field. It remains to be seen if this attempt will make it possible for those doctrinally ill-disposed towards theories that regard derivations as optional to engage in a serious, non-hostile debate with some version of these ideas. Notice that the programme presented here is compatible with several types of development and implementation. In particular, it does not lead us to reject the use of derivations, or traces, or what have you. It enables us to take a certain kind of view of theories and the gaps between theories. The point is to see where the adoption of such a view will lead us. That is an empirical question.
Note

1. First published in Linguistic Analysis. 25(12): 121–36. Reprinted with permission.
References

Dasgupta, Probal. 1989. Projective Syntax: Theory and Applications. Pune: Deccan College Postgraduate and Research Institute.
Ford, Alan and Rajendra Singh. 1992. ‘Propédeutique morphologique’. Folia Linguistica. 25: 549–75.
12 A Perfect Strategy for Latin
Byron W. Bender

Introduction

The ‘morphological strategy’ has been proposed by Ford, Singh and Martohardjono (1997: 1) as the sole formalism necessary for describing ‘any morphological relationship between two words of a language’. They maintain that none of the other distinctions, concepts, or levels of representation that have sometimes been used in the study of morphology should be incorporated into the formal theory of grammar. Gone, under their proposed constraints, would be distinctions such as inflection vs word-formation, and concepts such as the paradigm, among others. They say that ‘the burden of proof is, clearly, on those who want to ... introduce additional devices to account for the facts’ (Ford, Singh and Martohardjono 1997: 3–4).1 In another paper (Bender 2000), I have attempted to show that the paradigm is just such a device, while also presupposing a dual role for morphology in relation to the lexicon, one that describes the relationship between different words that resemble each other formally and semantically, and one that describes the relationship between different inflections of one and the same word.2 It is in connection with this latter role that the paradigm exists as an inherent part of language design. Paradigms and inflection are part and parcel of the same phenomenon in those languages where they occur, such as Hungarian, for example, one of the languages Carstairs (1987) uses to demonstrate convincingly the existence of paradigms through his Paradigm Economy Principle, or such as Latin, the focus of this paper.3 Here, I explore further and in some detail the role and exact nature of morphological strategies in Latin verb inflection. I will expand a bit on the assertion that paradigms are an inherent part
of language design. Then I review the special strategies known as ‘paradigmatic strategies’ proposed in Bender (2000), and go on to explore how such strategies might best deal with the complex phenomena presented by Latin inflections, including especially the perfect-nonperfect distinction, one deeply embedded in the verb root. This study is holistic and quantitative in that it takes all of Latin finite verb inflection for its scope, and gives approximate numbers of members for each of the subparadigms that are noted.
Paradigms, An Inherent Part of Language Design

Why some languages have inflection and others do not is a good question, but not one on which I have anything to offer here. My interest in this section is in the nature of inflection in those languages that have it. We can recognize inflection as occurring when a given word in the lexicon is given different forms, depending on the morphosyntactic features it bears, and when parallel treatment is accorded other members of the same part of speech. The semantics of the features must be constant and predictable for each word inflected. When this occurs, a paradigm exists simply by virtue of the morphosyntactic features involved. If the features come from more than one morphosyntactic category, the paradigm will have both rows and columns. To give a simple illustration, if verbs are inflected for person and number of their subject, the minimal paradigm of Table 12.1 will be found. The paradigm exists simply by virtue of the set of the features, drawn from two categories, person and number.4 The cells must be populated by different forms, but their exact shape is irrelevant to its existence. Table 12.2 gives the actual forms for a Latin 1st conjugation verb in the present active indicative.

Table 12.1: Paradigm Implied by Certain Features of Person and Number

            Nonplural    Plural
Speaker
Hearer
Neither

Table 12.2: Actual Forms for a Latin 1st Conjugation Verb

            Nonplural    Plural
Speaker     amô          amâmus
Hearer      amâs         amâtis
Neither     amat         amant
Table 12.3: Possible and Existing Combinations of Certain Latin Verb Endings

            1st     *       *       *       4th     *       *       *
Future      âbô     âbô     iam     iam     iam     iam     âbô     âbô
Present     ô       ô       ô       ô       iô      iô      iô      iô
Imperfect   âbam    iêbam   âbam    iêbam   iêbam   âbam    iêbam   âbam
If that were all that could be said generally about paradigms, then the assertion would be true that they are nothing more than ‘an epiphenomenon of the morphosyntactic feature system and therefore of no intrinsic interest’ (Spencer 1991: 224). But the most compelling thing that can be said of them in addition is the following: they are capable of superimposing purely morphological classes (conjugational or declensional) upon their part of speech, and when they do, they come in sets, with the sets optimally unmixed.5 This can be illustrated by looking at just two of the four sets imposed on Latin verbs, the 1st and 4th conjugations.6 The conjugations differ primarily in their present active indicative forms, but also in their future and imperfect forms. Table 12.3 gives only the first person singular endings for each of the three tenses. Note that of the eight possible combinations of these endings that could occur, if they were free to intermix, only two actually do. There is a 1st conjugation pattern and a 4th conjugation pattern, and the two do not intermix. This sort of evidence is the strongest I know of for the existence of paradigms. Spencer concludes that ‘perhaps the most interesting implication of work such as this is that, if it captures a linguistic universal, and if it can be shown that that universal must form part of the human language faculty, then it is difficult to see how linguistic theory will be able to do without the notion of “paradigm” (in one form or another). Carstairs’ work therefore presents a challenge to those who would maintain that the paradigm is a mere epiphenomenon, with no autonomous role in Universal Grammar’ (Spencer 1991: 229).
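The count of eight possible combinations against two attested ones can be checked mechanically. The following is only an illustrative sketch: the endings are those of Table 12.3, while the names `ENDINGS`, `ATTESTED`, `combinations` and `classify` are hypothetical labels introduced here, not part of any proposal in the text.

```python
from itertools import product

# First-person-singular endings from Table 12.3, two candidates per tense.
ENDINGS = {
    "future": ("âbô", "iam"),
    "present": ("ô", "iô"),
    "imperfect": ("âbam", "iêbam"),
}

# The two patterns the text reports as actually occurring.
ATTESTED = {
    ("âbô", "ô", "âbam"): "1st",
    ("iam", "iô", "iêbam"): "4th",
}


def combinations():
    """Return every logically possible (future, present, imperfect) triple."""
    return list(product(*ENDINGS.values()))


def classify(combo):
    """Label a triple with its conjugation, or '*' if it is unattested."""
    return ATTESTED.get(combo, "*")


if __name__ == "__main__":
    for combo in combinations():
        print(classify(combo), *combo)
```

Free intermixing would license all eight triples; the sketch simply makes visible that six of them receive the asterisk.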
Paradigmatic Strategies

In Bender (2000), I refer to the strategies introduced in Ford, Singh and Martohardjono (1997) as ‘reciprocal strategies’, the sort that relate pairs of forms.

(1) /Xô/ 1SG ↔ /Xâtis/ 2PL
    cf. amô : amâtis, portô : portâtis, etc.
A total of 15 would be required to relate every possible pair combination of a simple six-member paradigm like that in Table 12.2, and a total of 4,005 such reciprocal strategies would be required for the 90 inflections of a regular Latin verb.7 To decrease this number when dealing with inflection, I propose (in Bender 2000: 22) ‘paradigmatic strategies’ in their stead. Example (2) replaces not only (1) but also the other 14 reciprocal strategies needed to relate any of the six forms in Table 12.2 to any other.

(2) PARADIGMATIC STRATEGY
    conj 1: nonperf pres act ind

          sg.           pl.
    1.    Xô            Xâmus
    2.    Xâs     *     Xâtis
    3.    Xat           Xant
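The figures 15 and 4,005 follow from counting unordered pairs of forms: a paradigm of n members has n(n-1)/2 such pairs. A minimal check, in which the function name `reciprocal_strategy_count` is a hypothetical label used only for this illustration:

```python
from math import comb


def reciprocal_strategy_count(n_forms: int) -> int:
    """Number of unordered pairs among n_forms inflections, i.e. n choose 2."""
    return comb(n_forms, 2)


if __name__ == "__main__":
    print(reciprocal_strategy_count(6))   # six-member paradigm of Table 12.2
    print(reciprocal_strategy_count(90))  # 90 inflections of a regular verb
```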
The asterisk at the centre is intended as a label for paradigmatic strategies, a reminder of their multifaceted, radial nature. Example (2) contains, in addition to the two poles of (1), the other four poles that would be involved in the total of 15 different reciprocal strategies that it replaces.8 If that were all that paradigmatic strategies did, display in more compact form the same content as reciprocal strategies, they would be nothing more than notational variants, and undeserving of any theoretical status. But there is an important distinction. The 15 reciprocal strategies relating the forms of the paradigm in Table 12.2 do not have a constant value for X, which will vary according to what the members of each pair have in common. Compare (1), repeated here in modified form, with (3) and (4).

(1) /Xô/ 1SG  ↔  /Xâtis/ 2PL     (X = am, port, etc.)
    amô          amâtis

(3) /Xs/ 2SG  ↔  /Xtis/ 2PL      (X = amâ, portâ, etc.)
    amâs         amâtis

(4) /Xt/ 3SG  ↔  /Xatis/ 2PL     (X = ama, porta, etc.)9
    amat         amaatis
As pairs considered in isolation from any other inflections, there is no other principled basis for determining the value of X. But when the same forms are considered as co-members of a paradigm, X can be defined paradigm-wide as being the maximum form all members of the paradigm have in common. This would seem to reflect advantages in performance; one need not pause to adjust the value of X as one moves about the paradigm mentally. And, of some possible theoretical interest, when long vowels are treated as integral wholes (see Note 9), X is coterminous with the traditional root. The six paradigmatic strategies that lie at the heart of 1st conjugation inflection are given in (5–10).10 Each of the six contains 15 poles, or a total of 90, one for each of the 90 inflections. Each should be considered as occupying (somehow) four dimensions, those of voice, aspect, tense, and mood. Each is dedicated to a single person-number combination, and none includes the strategies necessary for changing these combinations, for moving from one of the six to another. This can be done at any of the 15 cells, by tying it ‘cross-sectionally’, along two additional dimensions (person and number), to paradigmatic strategies (a total of 15, one for each cell) like those in (2). The additional 14 are not given here, as they contain no new poles.11 In a sense, they (together with [2]) consist only of the 15 sets of interconnections necessary to make any switch of person and number, simultaneously. This is admittedly a complicated maze, as we attempt to represent visually the six dimensions involved. In Bender (2000), I refer to the six strategies in (5–10) as ‘corridor’ strategies, and to the 15 other strategies that connect them as ‘corner’ strategies; this may or may not be of help to some readers.
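The contrast between a pairwise X and a paradigm-wide X can be made concrete by computing the longest shared initial string of a set of forms. This is a rough sketch under stated assumptions: "the maximum form all members have in common" is approximated as a common prefix, long vowels are written with a circumflex, and the `split_long` option (a hypothetical name, like `common_part` and `LONG_AS_DOUBLE`) imitates the treatment in (4), where a long vowel counts as two short ones.

```python
import unicodedata
from os.path import commonprefix

# Forms of Table 12.2 (1st conjugation, present active indicative).
PARADIGM = ["amô", "amâs", "amat", "amâmus", "amâtis", "amant"]

# Assumed respelling of long vowels as doubled short vowels, as in "amaatis".
LONG_AS_DOUBLE = str.maketrans({"â": "aa", "ê": "ee", "î": "ii", "ô": "oo"})


def common_part(forms, split_long=False):
    """Return the longest initial string shared by all forms.

    With split_long=True each long vowel is first respelled as two short
    vowels, so that a long and a short vowel can share their first half.
    """
    forms = [unicodedata.normalize("NFC", f) for f in forms]
    if split_long:
        forms = [f.translate(LONG_AS_DOUBLE) for f in forms]
    return commonprefix(forms)


if __name__ == "__main__":
    print(common_part(["amô", "amâtis"]))                   # pair (1)
    print(common_part(["amat", "amâtis"], split_long=True))  # pair (4)
    print(common_part(PARADIGM))                             # paradigm-wide X
```

Taken pairwise, the shared part shifts from pair to pair; taken over the whole paradigm with long vowels kept whole, it settles on a single value.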
We can think of the strategies as accomplishing the switches in wordform that accompany switches in morphosyntactic features: ±perfect, ±subjunctive, ±future, ±past (imperfect), and ±passive, all done along the corridors, while ±hearer, ±speaker, and ±plural are all done simultaneously in any of the corners. In Bender (2000), I give reasons for treating the latter as simultaneous. If it is any consolation, these six corridor strategies (with 15 poles each) and the 15 corner strategies12 (with the same poles, six
each) are able to accomplish what it would take 4,005 reciprocal strategies to accomplish. But we cannot rest yet. Thus far we have considered only one of the four conjugations!

(5) PARADIGMATIC STRATEGY, 1ST CONJUGATION, 1SG

                        -PERF                     +PERF
                  IND         SUBJ          IND         SUBJ
ACTIVE   FUT      Xâbô        -             X′erô       -
         PRES     Xô          Xem           X′î         X′erim
         PAST     Xâbam       Xârem         X′eram      X′issem
                                     *
PASSIVE  FUT      Xâbor       -
         PRES     Xor         Xer
         PAST     Xâbar       Xârer

(6) PARADIGMATIC STRATEGY, 1ST CONJUGATION, 2SG

                        -PERF                     +PERF
                  IND         SUBJ          IND         SUBJ
ACTIVE   FUT      Xâbis       -             X′eris      -
         PRES     Xâs         Xês           X′istî      X′erîs
         PAST     Xâbâs       Xârês         X′erâs      X′issês
                                     *
PASSIVE  FUT      Xâberis     -
         PRES     Xâris       Xêris
         PAST     Xâbâris     Xârêris

(7) PARADIGMATIC STRATEGY, 1ST CONJUGATION, 3SG

                        -PERF                     +PERF
                  IND         SUBJ          IND         SUBJ
ACTIVE   FUT      Xâbit       -             X′erit      -
         PRES     Xat         Xet           X′it        X′erit
         PAST     Xâbat       Xâret         X′erat      X′isset
                                     *
PASSIVE  FUT      Xâbitur     -
         PRES     Xâtur       Xêtur
         PAST     Xâbâtur     Xârêtur

(8) PARADIGMATIC STRATEGY, 1ST CONJUGATION, 1PL

                        -PERF                     +PERF
                  IND         SUBJ          IND         SUBJ
ACTIVE   FUT      Xâbimus     -             X′erimus    -
         PRES     Xâmus       Xêmus         X′imus      X′erîmus
         PAST     Xâbâmus     Xârêmus       X′erâmus    X′issêmus
                                     *
PASSIVE  FUT      Xâbimur     -
         PRES     Xâmur       Xêmur
         PAST     Xâbâmur     Xârêmur

(9) PARADIGMATIC STRATEGY, 1ST CONJUGATION, 2PL

                        -PERF                     +PERF
                  IND         SUBJ          IND         SUBJ
ACTIVE   FUT      Xâbitis     -             X′eritis    -
         PRES     Xâtis       Xêtis         X′istis     X′êrîtis
         PAST     Xâbâtis     Xârêtis       X′erâtis    X′issêtis
                                     *
PASSIVE  FUT      Xâbiminî    -
         PRES     Xâminî      Xêminî
         PAST     Xâbâminî    Xârêminî

(10) PARADIGMATIC STRATEGY, 1ST CONJUGATION, 3PL

                        -PERF                     +PERF
                  IND         SUBJ          IND         SUBJ
ACTIVE   FUT      Xâbunt      -             X′erint     -
         PRES     Xant        Xent          X′êrunt     X′erint
         PAST     Xâbant      Xârent        X′erant     X′issent
                                     *
PASSIVE  FUT      Xâbuntur    -
         PRES     Xantur      Xentur
         PAST     Xâbantur    Xârentur
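To see how one corridor strategy yields actual wordforms, its poles can be stored as root-plus-ending templates and filled in for a given verb. What follows is a sketch only: the endings are copied from (5), the nonperfect root am and perfect root amâv of amô are supplied as assumptions for illustration, and the names `CORRIDOR_1SG` and `instantiate` are hypothetical.

```python
# Poles of the 1SG corridor strategy in (5).
# Key: (voice, aspect, tense, mood); value: (root symbol, ending).
# "X" is the nonperfect root, "X'" the perfect root.
CORRIDOR_1SG = {
    ("active", "nonperf", "fut", "ind"): ("X", "âbô"),
    ("active", "nonperf", "pres", "ind"): ("X", "ô"),
    ("active", "nonperf", "past", "ind"): ("X", "âbam"),
    ("active", "nonperf", "pres", "subj"): ("X", "em"),
    ("active", "nonperf", "past", "subj"): ("X", "ârem"),
    ("passive", "nonperf", "fut", "ind"): ("X", "âbor"),
    ("passive", "nonperf", "pres", "ind"): ("X", "or"),
    ("passive", "nonperf", "past", "ind"): ("X", "âbar"),
    ("passive", "nonperf", "pres", "subj"): ("X", "er"),
    ("passive", "nonperf", "past", "subj"): ("X", "ârer"),
    ("active", "perf", "fut", "ind"): ("X'", "erô"),
    ("active", "perf", "pres", "ind"): ("X'", "î"),
    ("active", "perf", "past", "ind"): ("X'", "eram"),
    ("active", "perf", "pres", "subj"): ("X'", "erim"),
    ("active", "perf", "past", "subj"): ("X'", "issem"),
}


def instantiate(strategy, roots):
    """Fill every pole of a strategy with the matching root of one verb."""
    return {cell: roots[sym] + ending for cell, (sym, ending) in strategy.items()}


if __name__ == "__main__":
    # Roots assumed here for the verb amô.
    forms = instantiate(CORRIDOR_1SG, {"X": "am", "X'": "amâv"})
    for cell, form in forms.items():
        print(*cell, form)
```

Each of the six corridor strategies has the same 15 cells, which is where the total of 90 inflections comes from.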
Paradigm Mixture of the 3rd and 4th Conjugations

The strategies for the other conjugations are given in the appendices, the 2nd conjugation in Appendix 1, and the 3rd and 4th in Appendix 2. The latter two are presented together because they provide us with an instance of paradigm mixture, one that resulted in what are now referred to as ‘3rd conjugation verbs in -iô’: roughly a dozen common verbs (and their prefixal compounds) that were originally members of the 4th conjugation, but which through morphological (analogical) change have crossed over partially to the 3rd conjugation paradigm, especially in that least-marked sector, the (nonperfect) present active indicative. They will be referred to here simply as ‘mixed’ and labelled as (3a). That the mixture occurred, and how it occurred, is of some relevance to our interest both in paradigms and in strategies. Paradigms must exist in order for paradigm mixture to occur, and if strategies are at the heart of morphology, they too should shed some light on and bear some relation to our understanding of paradigm mixture. Presenting the strategies of these two conjugations side-by-side brings out the similarities that must have helped create confusion at certain points.
The juxtaposition of the strategies of conjugations (3) and (4) in Appendices 2 and 3 brings out their similarities and differences, the latter being often simply the presence (CONJ 4) or absence (CONJ 3) of an initial i in their poles.13 In Appendix 2, three paradigmatic strategies are conflated in each table, with 3rd conjugation above, 4th below, and mixed (3a) in between. Those table cells in which mixture has occurred are shaded. These are the cells in which (3a) is identical in form to 3 rather than to 4. The one cell in which the two conjugations were already identical, and which thus probably constituted the beachhead for the analogizing that led to the mixture elsewhere, is the 3S nonperfect present active indicative, enclosed by framing in the third-person singular table of Appendix 1, and in Tables 12.4 and 12.5.14 This cell was, in a sense, mixed from the beginning. Because the third-person singular Xit was an equally valid pole in the strategies of both 3rd and 4th conjugations (see Table 12.4), it could lead to either result when it was part of the strategizing. When a verb was known only in its 3S form, that form could lead to either 3rd or 4th conjugation forms in the other person-number combinations. Why eventual paradigm mixture occurred only in certain of these combinations, and only with certain verbs and not with others, are questions of considerable interest, but beyond the scope of this paper, except to say that the answers seem to be in large part prosodic.15

Table 12.4: The Ambiguity of 3S Verbforms

3rd Conjugation
                 Nonplural    Plural
1   Speaker      Xô           Ximus
2   Hearer       Xis          Xitis
3   Neither      Xit          Xunt

4th Conjugation
                 Nonplural    Plural
1   Speaker      Xiô          Xîmus
2   Hearer       Xîs          Xîtis
3   Neither      Xit          Xiunt

Table 12.5: Paradigm Mixture between Conjugations 3 and 4 (in shaded areas, where 3a = 3)29

Nonperfect Present Indicative

Active
        3        3a=3     3a       3a=4     4
1S      ô                 iô       =        iô
2S      is       =        is                îs
3S      it       =        it       =        it
1P      imus     =        imus              îmus
2P      itis     =        itis              îtis
3P      unt               iunt     =        iunt

Passive
        3        3a=3     3a       3a=4     4
1S      or                ior      =        ior
2S      eris     =        eris              îris
3S      itur     =        itur              îtur
1P      imur     =        imur              îmur
2P      iminî    =        iminî             îminî
3P      untur             iuntur   =        iuntur

The associated ‘corner’ strategies shown in Table 12.5 give another perspective on where mixture occurred and where it did not in the nonperfect present active indicative. Note in Table 12.5 the match in what was mixed between the passive and the active. This fact alone calls for some explanation, especially when it is also noted that the other sector in which mixture occurred also pairs active and passive completely: the past (imperfect) subjunctive. Why should the paradigm mixture identified in Appendix 2, involving at most four of the 15 cells in each table as it does, always include both members of a passive-active pair? Why are both passive and active cells of a given aspect-tense-mood combination always shaded?16 I would offer the following explanation. A certain syntagmatic iconicity can be observed in Latin verbs, with aspect being closest to the core (as we shall see in the following section); person, number, and voice most peripheral; and tense and mood intermediate. This refers both to the location of the exponents for each of these categories within the verbform, and semantically to whether the categories are peculiar to the verb, or whether they relate to its arguments as well; thereby the iconicity.17 Note that voice is, if anything, even more peripheral than person and number, so much so that there exist the reciprocal strategies given in Table 12.6 for switching between active and passive. These can be gleaned from comparing the active and passive counterparts in the paradigmatic strategies of (5–10) and the appendices.
Those for the plural are slightly more straightforward than those for the singular,18 each of which must be qualified to a certain extent: complementary strategies are necessary in 1s; one must be aware that short i before s alternates with e before r in 2s; in 3S, the final t of the active is a syllable coda that shortens long vowels in its nucleus; in the tur of the passive, this same t becomes the onset of a new syllable and no longer shortens the preceding vowel. The problem then is in determining which vowels are inherently long and which are inherently short as one passivizes. Inherently short vowels remain short in the passive; others become long. All eís and aís turn out to be long; only certain iís are short. I conclude that the strategies of Table 12.6 are not only simple and straightforward, but also that they must have been constantly employed to obtain the proper passive forms from the active,19 and vice versa. When one became mixed, the other did too. Passive was a peripheral
310 ! Byron W. Bender Table 12.6: Reciprocal Strategies for Active 1S 2S 3S 1P 2P 3P
ô] m] s] t] mus] tis] nt]
± passive30
Passive or] ↔ r] (in complementary distribution) ↔ ris] (i / _s] alternates with e / _ris] [short i only]) ↔ ↔ (:)tur] (vowels shorten before t, lengthen before tur)31 ↔ mur] (could be expressed simply as s] ↔ r]) ↔ min˘] ↔ ntur] (could be expressed simply as] ↔ ur])
category that was responsive to syntactic considerations and must have often been switched independently of the remainder of the verbform simply by resorting to the very general and powerful reciprocal strategies of Table 12.6. With reference to endnote 31 of Table 12.6, the i of the third-person singular (present active indicative) of 3a verbs also fails to lengthen in the passive. If these verbs were in fact originally members of the 4th conjugation, this means that this vowel was once inherently long, but has been reanalyzed as inherently short, like that of original 3rd conjugation verbs. The other obvious question that needs to be asked is why, of all places, should mixture have occurred in the past (imperfect) subjunctiveóagain, both active and passive? The reasons for its occurring in the least-marked, most frequent forms of the nonperfect nonfuture nonpast nonsubjunctiveóthat is, the ëpresent indicativeíóhave already been discussed, but why should the only other incidence have been in the past (imperfect) subjunctive? I note that the choice here is between long and short i preceding an r, which consonantóas often happens in Latin phonologyólowers the short i to e. But whether (and if so, how) this might have contributed to the confusion is not clear to me at this point. I turn now to the main focus of this paper, the strategizing involved with the perfect inflections.
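The reciprocal strategies of Table 12.6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: wordforms are written with macrons for long vowels, the function names passivize and activize are hypothetical, and the short_vowel flag is an invented stand-in for the speaker's knowledge of whether the vowel before 3S t is inherently short (as with the future bi and 3rd conjugation it).

```python
import unicodedata

# Short/long vowel pairs, used for the 3S alternation t] <-> (:)tur].
LONG = {"a": "ā", "e": "ē", "i": "ī", "o": "ō", "u": "ū"}
SHORT = {long_v: short_v for short_v, long_v in LONG.items()}


def passivize(active: str, short_vowel: bool = False) -> str:
    """Active -> passive by swapping material at the right edge of the word."""
    form = unicodedata.normalize("NFC", active)
    if form.endswith("mus"):                 # 1P  mus] <-> mur]
        return form[:-3] + "mur"
    if form.endswith("tis"):                 # 2P  tis] <-> minī]
        return form[:-3] + "minī"
    if form.endswith("nt"):                  # 3P  nt] <-> ntur]
        return form + "ur"
    if form.endswith("t"):                   # 3S  t] <-> (:)tur]
        stem = form[:-1]
        if stem and not short_vowel:         # inherently long vowels resurface
            stem = stem[:-1] + LONG.get(stem[-1], stem[-1])
        return stem + "tur"
    if form.endswith("s"):                   # 2S  s] <-> ris]
        stem = form[:-1]
        if stem.endswith("i"):               # short i / _s] ~ e / _ris]
            stem = stem[:-1] + "e"
        return stem + "ris"
    if form.endswith("ō"):                   # 1S  ō] <-> or]
        return form[:-1] + "or"
    if form.endswith("m"):                   # 1S  m] <-> r]
        return form[:-1] + "r"
    raise ValueError(f"no active ending recognized in {active!r}")


def activize(passive: str) -> str:
    """Passive -> active: the same strategies read in the other direction."""
    form = unicodedata.normalize("NFC", passive)
    if form.endswith("ntur"):
        return form[:-2]
    if form.endswith("minī"):
        return form[:-4] + "tis"
    if form.endswith("mur"):
        return form[:-3] + "mus"
    if form.endswith("tur"):                 # final t shortens the vowel
        stem = form[:-3]
        return stem[:-1] + SHORT.get(stem[-1], stem[-1]) + "t"
    if form.endswith("eris"):                # e / _ris] ~ short i / _s]
        return form[:-4] + "is"
    if form.endswith("ris"):
        return form[:-3] + "s"
    if form.endswith("or"):
        return form[:-2] + "ō"
    if form.endswith("r"):
        return form[:-1] + "m"
    raise ValueError(f"no passive ending recognized in {passive!r}")
```

In this sketch only the 3S case needs information beyond the wordform itself, which matches the observation that the difficulty in passivizing lies in knowing which vowels are inherently long.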
The Perfect Aspect

Readers who have taken the time to study in some detail the paradigmatic strategies that are presented for Latin in (5–10) and the appendices may have noticed two not unrelated facts: (a) the perfect strategies are identical for all conjugations, and (b) the morphological constant for all perfect poles is X′ rather than X, thus opening the possibility that X may not in fact be constant across the entire paradigm. Those who recall that the initial schema of Ford, Singh and Martohardjono (1997: 1) is of the form /X/a ↔ /X′/b would be correct in assuming that what we are dealing with here is in effect a strategy within a strategy. It was noted in the section on Paradigmatic Strategies that the X constant in Latin verbal paradigmatic strategies coincides with what has been recognized traditionally as the root; use of X′ is thus a means of dealing with root mutations involved in the formation of the perfect aspect, some of them quite extensive.20 A look at the data will show what has driven us to this extreme.
Morphological Processes Involved in Perfect Formation

X and X′ stand in a variety of relations to each other, each of which can be illustrated with 3rd conjugation verbs.

Reduplication: Regular examples of the partial reduplication that is involved have values for X and X′ respectively such as pend, pepend ‘weigh, hang down’ and tend, tetend ‘stretch’. Reduplication is sometimes accompanied by vowel ablaut (cad, cecid ‘fall’; caed, cecīd ‘cut’) or nasal deletion (see section: Root vowel lengthening) (pung, pupug ‘prick’; tund, tutud ‘beat’) or both (pang, pepig ‘fasten’; tang, tetig ‘touch’). Less regular examples include pell, pepul ‘drive’; sist, stit ‘set’; gign, genu ‘beget’; ser, sēv ‘sow’,21 some of which involve reduplication of nonperfect forms and/or incremental u/v (see section: Incremental u/v).

Root Vowel Lengthening: Regular examples include scab, scāb ‘scratch’; ed, ēd ‘eat’; lav, lāv ‘wash’. This vowel lengthening is sometimes accompanied by nasal deletion (vinc, vīc ‘conquer’; fund, fūd ‘pour’; rump, rūp ‘break, burst’; linqu, līqu ‘leave’) or by vowel ablaut (ag, ēg ‘drive’) or by both (frang, frēg ‘break, fragment’; pang, pēg ‘fix, fasten’).

Nasal Deletion: For a few verbs, X′ is achieved by what appears to be nasal deletion: find, fid ‘split’; scind, scid ‘rend, tear’.22 In contrast, other quite similar verbs retain a nasal: -fend, -fend ‘ward off’; -cend, -cend ‘kindle’,23 and others lengthen the vowel while deleting the nasal, as noted in the preceding section.
312 ! Byron W. Bender
Identity: For some verbs, the values of X and X′ are identical: vert ‘turn’, prehend ‘seize’, mand ‘chew’, pand ‘open’, sal ‘leap’, vell ‘pluck’, verr ‘sweep’, sīd ‘settle’, bib ‘drink’, strīd ‘grate, hiss’, īc ‘hit’. A number of members of this group end in u/v, and could thus just as well be listed in the following section, with the increment having applied vacuously: tribu ‘assign’, imbu ‘give a taste of’, acu ‘sharpen’, argu ‘accuse’, sternu ‘sneeze’, exu ‘put off’, solv ‘loose, pay’, volv ‘turn, roll’, and so forth.

Incremental u/v: The X element receives a -u/v suffix to form X′ (a vowel/glide element with u and v being in complementary distribution, depending on whether a consonant or vowel precedes): gem, gemu ‘groan’; vom, vomu ‘vomit’; strep, strepu ‘sound’; tecs, tecsu ‘weave’; stert, stertu ‘snore’; al, alu ‘nourish’; col, colu ‘till’; mol, molu ‘grind’; occul, occulu ‘hide’, and so forth. This is sometimes accompanied by nasal deletion and vowel lengthening (sin, sīv ‘permit’; lin, līv ~ lēv ‘besmear’), and by vowel ablaut as well, as the last variant shows.

Incremental s: The X element receives an -s suffix to form X′. This true consonant suffix involves a great deal of sandhi as it is juxtaposed to other consonants, with many of the alternations being automatic, something that is dealt with by the phonology and not reflected in strategies as conceived by Ford, Singh and Martohardjono (1997: 2). Examples include: nūb, nūps ‘marry’; scrīb, scrīps ‘write’; dīc, dīx ‘say’; dūc, dūx ‘guide’; parc, pars ‘spare’; fīg, fīx ‘fix’; sūg, sūx ‘suck’; sūm, sūmps ‘take’; sculp, sculps ‘carve’; carp, carps ‘pluck’; flect, flex ‘bend’; nect, nex ‘bind, weave’; pect, pecs ‘comb’. When the suffix follows a velar or h, the root vowel is often lengthened: reg, rēx ‘rule’; teg, tēx ‘cover’; ang, ānx ‘choke’; fing, fīnx ‘mould’; iung, iūnx ‘join’; ung, ūnx ‘anoint’; trah, trāx ‘drag’; veh, vēx ‘draw’, and so forth. The combination rg is an exception to this velar lengthening: merg, mers ‘plunge’; terg, ters ‘wipe’; sparg, spars ‘scatter’, and so forth. An unclustered d preceded by a long vowel or diphthong deletes: rād, rās ‘scrape’; rōd, rōs ‘gnaw’; trūd, trūs ‘thrust’; vād, vās ‘go’; laed, laes ‘hurt’; claud, claus ‘shut’; plaud, plaus ‘applaud’.

Miscellaneous: One additional phenomenon may be concomitant to the formation of X′, and that is metathesis of the root vowel and a following r. In at least two verbs it is concomitant with the u/v increment, root vowel lengthening and nasal deletion: spern, sprēv ‘scorn’; cern, crēv ‘separate’. In a third verb, vowel ablaut also occurs: stern, strāv ‘strew’. The verb ter, trīv ‘rub’ involves all the same phenomena except for nasal deletion, and although ser, sēv might be explained by having the metathesized r delete upon being clustered with the initial s, another explanation involves nonperfect reduplication and rhotacism (see subheading Reduplication).
The Productive Process

There is one additional relation between X and X′, one that became the productive process of perfect formation for Latin verbs. It involves the suffixation of u/v (as in subheading Incremental u/v) but with the long stem (or ‘theme’) vowel of the conjugation intervening. The third conjugation was the only one that did not have a long stem vowel, but there are a handful of verbs that conformed to this pattern by lengthening the short i that is found just after the root in many inflected forms of this conjugation: arcess, arcessīv ‘summon’; capess, capessīv ‘undertake, seize’; incess, incessīv ‘attack’; lacess, lacessīv ‘provoke’; pet, petīv ‘seek’; quaer, quaesīv ‘seek’; rud, rudīv ‘bray’; ter, trīv ‘rub’. These are the only instances of this process for the third conjugation, unless two from the mixed (3a) conjugation are also included: cup, cupīv ‘desire’; sap, sapīv ‘be wise’.

Second Conjugation: From the 2nd conjugation, there are only a handful: dēl, dēlēv ‘destroy’; fl, flēv ‘weep’; n, nēv ‘sew’; vi, viēv ‘plait’; compl, complēv ‘fill up’. The majority pattern for the 2nd conjugation, that of mon, monu ‘advise’, was followed by some 70 verbs in all.24

Fourth Conjugation: Stem vowel + v added to X was the majority pattern of perfect formation in the 4th conjugation, with some 40 verbs like aud, audīv ‘hear’.

First Conjugation: Here the stem vowel was ā, but the same pattern prevailed in even greater proportions in this large class of an estimated 360 simple verbs, where only a few followed other patterns. There is evidence of attrition among some of these latter, which developed variants that followed the majority pattern: cub, cubu ~ cubāv ‘lie down’; nec, necu ~ necāv ‘kill’; plic, plicu ~ plicāv ‘fold up’; crep, crepu ~ crepāv ‘rattle, resound’.
Alternatives

In the section on Morphological processes involved in perfect formation we looked at a variety of relations between X and X′ to be found in the 3rd conjugation, and in the section on the productive process noted yet another relation, which, although not prevalent in the 3rd conjugation, is clearly the majority pattern in the 1st and 4th conjugations. Table 12.7 plots all of these relations against the conjugations, giving approximate numbers of verbs that employ each process as the main or sole means of altering that portion of their form that remains constant throughout nonperfect inflections (X = their roots) to a portion that remains constant throughout the perfect inflections (X′). Although the identity and incremental processes in the lower part of Table 12.7 could be handled with a single and constant X by increasing the material to the right in their perfect poles, processes in the top part of the table, such as reduplication, vowel lengthening, and ablaut, do not submit as readily to the same formalisms, or even those that vary material both to the left and right of the constant, as in (11), or those with more than one constant, as in (12).

(11) /C1V1Xō/NONPERF ↔ /C1V1C1V1Xī/PERF    cf. pendō ~ pependī, currō ~ cucurrī, etc.
(12) /XaYiō/NONPERF ↔ /XēYī/PERF    cf. capiō ~ cēpī, iaciō ~ iēcī, etc.

Although I am using reciprocal strategies here for purposes of illustration, separate paradigmatic strategies would have to be set up for each group of verbs that shared a given means of perfect formation

Table 12.7: Processes of Perfect Formation, by Conjugation
Conjugation: 5.1.1 reduplication; 5.1.2 root vowel lengthening; 5.1.3, 5.1.7 miscellaneous; 5.1.4 identity; 5.1.5 u/v increment; 5.1.6 s increment; 5.2 stem vowel + v increment
                 1      2      3     3A      4
5.1.1            2      4     19      2      3
5.1.2            2      8     10      6      1
5.1.3, 5.1.7                   2
5.1.4                   6     38
5.1.5           12     70     17      2      5
5.1.6                  21     38      4     13
5.2            350      5      9      3     40
in each conjugation, resulting in a great proliferation of even paradigmatic strategies. Thus, the fact that the processes of perfect formation crosscut the conjugations to the extent that they do (as shown in Table 12.7) is the best argument I know of for separating out the two types of strategies by means of the X′ device, and having as it were strategies within strategies: root mutation strategies within the affixation strategies. This has the happy result of greatly reducing the number and complexity of the paradigmatic strategies needed.

It should be noted that the domain of X′ coincides with the domain of the feature PERFECT. It is implied that speakers altered the two concomitantly. When one changed the feature PERFECT, one needed not only to engage in the usual shifts of affixation but also to perform a deeper alteration in what was otherwise held constant. In some instances, X′ is larger than X, and larger than it needs to be for a given group of verbs. The size of X′ has been determined by maximizing what all perfect inflections have in common, given in Table 12.8. Everything else specific to the perfect, but only for certain verbs, has been subsumed under X′.
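To make concrete why a formalism such as (11) does not generalize, the following Python sketch applies the reduplication schema mechanically. The function names are hypothetical and the code assumes roots written as plain strings with an initial consonant followed by a vowel; it is an illustration, not part of the strategy formalism itself.

```python
VOWELS = set("aeiouāēīōū")


def reduplicate(root: str) -> str:
    """Copy the initial consonant and vowel: C1V1X -> C1V1C1V1X, as in (11)."""
    if len(root) < 2 or root[0] in VOWELS or root[1] not in VOWELS:
        raise ValueError(f"{root!r} does not begin with consonant + vowel")
    return root[:2] + root


def nonperfect_1s(root: str) -> str:
    """Nonperfect pole of (11): /C1V1Xō/."""
    return root + "ō"


def perfect_1s(root: str) -> str:
    """Perfect pole of (11): /C1V1C1V1Xī/."""
    return reduplicate(root) + "ī"
```

For pend and curr this yields pependī and cucurrī, as in (11). Applied to cad, however, it yields cacad rather than cecid, since the accompanying vowel ablaut falls outside the schema; each such group of verbs would need a schema of its own, which is the proliferation that the X′ device avoids.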
The Perfect Strategy

It turns out that the value of X′ will be for every verb what has traditionally been referred to as its perfect stem (see, for example, Aronoff 1994: 47ff). In order to inflect any verb for perfect aspect, this stem must be known. For the hundreds of regular 1st conjugation and 4th conjugation verbs, it can easily be derived from knowledge of any nonperfect form through the regularities expressed in (13):

(13) 1ST CONJUGATION: /X/NONPERF ↔ /Xāv/ = /X′/PERF
     4TH CONJUGATION: /X/NONPERF ↔ /Xīv/ = /X′/PERF

Table 12.8: A Perfect Strategy for All Conjugations

All Conj *                       −Plural                  +Plural
                                 −Subj      +Subj         −Subj       +Subj
+SPEAKER               FUT       X′erō      ⇐             X′erimus    ⇐
                       PRES      X′ī        X′erim        X′imus      X′erīmus
                       PAST      X′eram     X′issem       X′erāmus    X′issēmus
+HEARER                FUT       X′eris     ⇐             X′eritis    ⇐
                       PRES      X′istī     X′erīs        X′istis     X′erītis
                       PAST      X′erās     X′issēs       X′erātis    X′issētis
NEITHER                FUT       X′erit     ⇐ ⇓           X′erint     ⇐ ⇓
(−SPEAKER, −HEARER)    PRES      X′it       X′erit        X′ērunt     X′erint
                       PAST      X′erat     X′isset       X′erant     X′issent

For any other verb, all of which must be considered irregular in their perfect formation, it is enough to know just one instance of the verb inflected for the perfect, from which X′ can be inferred by virtue of knowledge of the perfect strategy for all verbs given in Table 12.8. Strategies are, after all, based on knowledge of certain wordforms, and are used to transfer that knowledge to other wordforms.25 In order to inflect a verb that does not form its perfect stem regularly based on (13), it is necessary to know at least one of its perfect forms, from which all of the others can be formed using the strategies of Table 12.8.26 When any one perfect form is known, all the others can be inferred independently of any knowledge of nonperfect forms or their conjugational membership. In the perfect, there are no conjugations. Use of X′ permits us to leave that information behind. One paradigmatic strategy fits all.
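The division of labour between (13) and Table 12.8 can be sketched as follows. This is a hedged illustration only: the data layout, the cell labels, and the function names are assumptions made for the example, and the 2P present subjunctive ending is given as erītis, the reading found in Appendix 3.

```python
PERSONS = ("1S", "2S", "3S", "1P", "2P", "3P")

# Perfect active endings of Table 12.8, keyed by (tense, mood).
# Latin has no future subjunctive, so 18 + 12 = 30 cells in all.
PERFECT_ENDINGS = {
    ("FUT", "IND"): ("erō", "eris", "erit", "erimus", "eritis", "erint"),
    ("PRES", "IND"): ("ī", "istī", "it", "imus", "istis", "ērunt"),
    ("PAST", "IND"): ("eram", "erās", "erat", "erāmus", "erātis", "erant"),
    ("PRES", "SUBJ"): ("erim", "erīs", "erit", "erīmus", "erītis", "erint"),
    ("PAST", "SUBJ"): ("issem", "issēs", "isset", "issēmus", "issētis", "issent"),
}

# The regularities of (13): stem vowel + v for the 1st and 4th conjugations.
REGULAR_INCREMENT = {1: "āv", 4: "īv"}


def regular_perfect_stem(root: str, conjugation: int) -> str:
    """Derive X' from X for regular 1st and 4th conjugation verbs, as in (13)."""
    return root + REGULAR_INCREMENT[conjugation]


def inflect_perfect(stem: str) -> dict:
    """Generate all 30 perfect active forms from X', ignoring conjugation."""
    return {
        (person, tense, mood): stem + ending
        for (tense, mood), endings in PERFECT_ENDINGS.items()
        for person, ending in zip(PERSONS, endings)
    }


def infer_perfect_stem(form: str, person: str, tense: str, mood: str) -> str:
    """Recover X' from one known perfect form and its paradigm cell."""
    ending = PERFECT_ENDINGS[(tense, mood)][PERSONS.index(person)]
    if not form.endswith(ending):
        raise ValueError(f"{form!r} does not end in {ending!r}")
    return form[: len(form) - len(ending)]
```

For example, knowing only pependī as a 1S present perfect indicative is enough to recover pepend and from it pependērunt, with no reference to the verb's conjugation.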
Conclusion

Use of paradigmatic strategies has permitted us to sift through a total of 450 inflected forms (90 each for regular verbs of the five conjugations) and make some interesting observations as to their patterning. The feature PERFECT, which affects verbs both at their peripheries and at their cores, appeared less tractable to strategies initially, but nevertheless has proved capable of insightful treatment by means of ‘strategies within strategies’. Although this approach remains strictly a whole-word or ‘seamless’ approach, it is noteworthy that the two morphological constants emerging from the strategy formalisms conform with traditional entities smaller than the word, one with the root, and the other with the perfect stem.

What a speaker needed to know about the inflections of verbs of a given conjugation is summarized in Table 12.9 (for the 1st conjugation) plus the passive strategies of Table 12.6 and the perfect strategies of Table 12.8. Four more tables the size of Table 12.9 would be needed, one each for the other three original conjugations and the mixed conjugation (3a).27 The same Tables 12.6 and 12.8 give passive and perfect inflections for all the conjugations. The additional information needed would be the value of X′ in Table 12.8. For several hundred regular verbs this is already known. For the others it would have to be learned. For any particular irregular verb, knowledge of one wordform inflected for the PERFECT would be sufficient.

A bonus that adopting a strategizing perspective has accorded us is insight into how the active-passive strategies of Table 12.6 must have functioned to spread paradigm mixture across both voices wherever it occurred in the inflection of 3rd conjugation verbs in -iō. This is strong evidence for the psychological reality of these strategies.

Table 12.9: Paradigmatic Strategy for 1st Conjugation Verbs

Conj 1 *                         −Plural                  +Plural
                                 −Subj      +Subj         −Subj       +Subj
+SPEAKER               FUT       Xābō       ⇐             Xābimus     ⇐
                       PRES      Xō         Xem           Xāmus       Xēmus
                       PAST      Xābam      Xārem         Xābāmus     Xārēmus
+HEARER                FUT       Xābis      ⇐             Xābitis     ⇐
                       PRES      Xās        Xēs           Xātis       Xētis
                       PAST      Xābās      Xārēs         Xābātis     Xārētis
NEITHER                FUT       Xābit      ⇐             Xābunt      ⇐
(−SPEAKER, −HEARER)    PRES      Xat        Xet           Xant        Xent
                       PAST      Xābat      Xāret         Xābant      Xārent
APPENDIX 1: PARADIGMATIC STRATEGIES, CONJUGATION 2

PARADIGMATIC STRATEGY, CONJ 2: 2ND CONJUGATION, 1SG
                  −PERF                    +PERF
                  IND         SUBJ         IND         SUBJ
ACTIVE   FUT      Xēbō        ⇐            X′erō       ⇐
         PRES     Xeō         Xeam         X′ī         X′erim
         PAST     Xēbam       Xērem        X′eram      X′issem
PASSIVE  FUT      Xēbor       ⇐            *
         PRES     Xeor        Xear
         PAST     Xēbar       Xērer

PARADIGMATIC STRATEGY, CONJ 2: 2ND CONJUGATION, 2SG
                  −PERF                    +PERF
                  IND         SUBJ         IND         SUBJ
ACTIVE   FUT      Xēbis       ⇐            X′eris      ⇐
         PRES     Xēs         Xeās         X′istī      X′erīs
         PAST     Xēbās       Xērēs        X′erās      X′issēs
PASSIVE  FUT      Xēberis     ⇐            *
         PRES     Xēris       Xeāris
         PAST     Xēbāris     Xērēris

PARADIGMATIC STRATEGY, CONJ 2: 2ND CONJUGATION, 3SG
                  −PERF                    +PERF
                  IND         SUBJ         IND         SUBJ
ACTIVE   FUT      Xēbit       ⇐            X′erit      ⇐ ⇓
         PRES     Xet         Xeat         X′it        X′erit
         PAST     Xēbat       Xēret        X′erat      X′isset
PASSIVE  FUT      Xēbitur     ⇐            *
         PRES     Xētur       Xeātur
         PAST     Xēbātur     Xērētur

PARADIGMATIC STRATEGY, CONJ 2: 2ND CONJUGATION, 1PL
                  −PERF                    +PERF
                  IND         SUBJ         IND         SUBJ
ACTIVE   FUT      Xēbimus     ⇐            X′erimus    ⇐
         PRES     Xēmus       Xeāmus       X′imus      X′erīmus
         PAST     Xēbāmus     Xērēmus      X′erāmus    X′issēmus
PASSIVE  FUT      Xēbimur     ⇐            *
         PRES     Xēmur       Xeāmur
         PAST     Xēbāmur     Xērēmur

PARADIGMATIC STRATEGY, CONJ 2: 2ND CONJUGATION, 2PL
                  −PERF                    +PERF
                  IND         SUBJ         IND         SUBJ
ACTIVE   FUT      Xēbitis     ⇐            X′eritis    ⇐
         PRES     Xētis       Xeātis       X′istis     X′erītis
         PAST     Xēbātis     Xērētis      X′erātis    X′issētis
PASSIVE  FUT      Xēbiminī    ⇐            *
         PRES     Xēminī      Xeāminī
         PAST     Xēbāminī    Xērēminī

PARADIGMATIC STRATEGY, CONJ 2: 2ND CONJUGATION, 3PL
                  −PERF                    +PERF
                  IND         SUBJ         IND         SUBJ
ACTIVE   FUT      Xēbunt      ⇐            X′erint     ⇐ ⇓
         PRES     Xent        Xeant        X′ērunt     X′erint
         PAST     Xēbant      Xērent       X′erant     X′issent
PASSIVE  FUT      Xēbuntur    ⇐            *
         PRES     Xentur      Xeantur
         PAST     Xēbantur    Xērentur
APPENDIX 2: PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, & 4

PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, 4: 1SG
                       −PERF                     +PERF
                       IND          SUBJ         IND         SUBJ
ACTIVE   FUT    3      Xam          ⇐ ⇓          X′erō       ⇐
                3a     Xiam         ⇐ ⇓          X′erō       ⇐
                4      Xiam         ⇐ ⇓          X′erō       ⇐
         PRES   3      Xō           Xam          X′ī         X′erim
                3a     Xiō          Xiam         X′ī         X′erim
                4      Xiō          Xiam         X′ī         X′erim
         PAST   3      Xēbam        Xerem        X′eram      X′issem
                3a     Xiēbam       Xerem        X′eram      X′issem
                4      Xiēbam       Xīrem        X′eram      X′issem
PASSIVE  FUT    3      Xar          ⇐ ⇓          *
                3a     Xiar         ⇐ ⇓
                4      Xiar         ⇐ ⇓
         PRES   3      Xor          Xar
                3a     Xior         Xiar
                4      Xior         Xiar
         PAST   3      Xēbar        Xerer
                3a     Xiēbar       Xerer
                4      Xiēbar       Xīrer

PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, 4: 2SG
                       −PERF                     +PERF
                       IND          SUBJ         IND         SUBJ
ACTIVE   FUT    3      Xēs          ⇐            X′eris      ⇐
                3a     Xiēs         ⇐            X′eris      ⇐
                4      Xiēs         ⇐            X′eris      ⇐
         PRES   3      Xis          Xās          X′istī      X′erīs
                3a     Xis          Xiās         X′istī      X′erīs
                4      Xīs          Xiās         X′istī      X′erīs
         PAST   3      Xēbās        Xerēs        X′erās      X′issēs
                3a     Xiēbās       Xerēs        X′erās      X′issēs
                4      Xiēbās       Xīrēs        X′erās      X′issēs
PASSIVE  FUT    3      Xēris        ⇐            *
                3a     Xiēris       ⇐
                4      Xiēris       ⇐
         PRES   3      Xeris        Xāris
                3a     Xeris        Xiāris
                4      Xīris        Xiāris
         PAST   3      Xēbāris      Xerēris
                3a     Xiēbāris     Xerēris
                4      Xiēbāris     Xīrēris

PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, 4: 3SG
                       −PERF                     +PERF
                       IND          SUBJ         IND         SUBJ
ACTIVE   FUT    3      Xet          ⇐            X′erit      ⇐ ⇓
                3a     Xiet         ⇐            X′erit      ⇐ ⇓
                4      Xiet         ⇐            X′erit      ⇐ ⇓
         PRES   3      Xit          Xat          X′it        X′erit
                3a     Xit          Xiat         X′it        X′erit
                4      Xit          Xiat         X′it        X′erit
         PAST   3      Xēbat        Xeret        X′erat      X′isset
                3a     Xiēbat       Xeret        X′erat      X′isset
                4      Xiēbat       Xīret        X′erat      X′isset
PASSIVE  FUT    3      Xētur        ⇐            *
                3a     Xiētur       ⇐
                4      Xiētur       ⇐
         PRES   3      Xitur        Xātur
                3a     Xitur        Xiātur
                4      Xītur        Xiātur
         PAST   3      Xēbātur      Xerētur
                3a     Xiēbātur     Xerētur
                4      Xiēbātur     Xīrētur

PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, 4: 1PL
                       −PERF                     +PERF
                       IND          SUBJ         IND         SUBJ
ACTIVE   FUT    3      Xēmus        ⇐            X′erimus    ⇐
                3a     Xiēmus       ⇐            X′erimus    ⇐
                4      Xiēmus       ⇐            X′erimus    ⇐
         PRES   3      Ximus        Xāmus        X′imus      X′erīmus
                3a     Ximus        Xiāmus       X′imus      X′erīmus
                4      Xīmus        Xiāmus       X′imus      X′erīmus
         PAST   3      Xēbāmus      Xerēmus      X′erāmus    X′issēmus
                3a     Xiēbāmus     Xerēmus      X′erāmus    X′issēmus
                4      Xiēbāmus     Xīrēmus      X′erāmus    X′issēmus
PASSIVE  FUT    3      Xēmur        ⇐            *
                3a     Xiēmur       ⇐
                4      Xiēmur       ⇐
         PRES   3      Ximur        Xāmur
                3a     Ximur        Xiāmur
                4      Xīmur        Xiāmur
         PAST   3      Xēbāmur      Xerēmur
                3a     Xiēbāmur     Xerēmur
                4      Xiēbāmur     Xīrēmur

PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, 4: 2PL
                       −PERF                     +PERF
                       IND          SUBJ         IND         SUBJ
ACTIVE   FUT    3      Xētis        ⇐            X′eritis    ⇐
                3a     Xiētis       ⇐            X′eritis    ⇐
                4      Xiētis       ⇐            X′eritis    ⇐
         PRES   3      Xitis        Xātis        X′istis     X′erītis
                3a     Xitis        Xiātis       X′istis     X′erītis
                4      Xītis        Xiātis       X′istis     X′erītis
         PAST   3      Xēbātis      Xerētis      X′erātis    X′issētis
                3a     Xiēbātis     Xerētis      X′erātis    X′issētis
                4      Xiēbātis     Xīrētis      X′erātis    X′issētis
PASSIVE  FUT    3      Xēminī       ⇐            *
                3a     Xiēminī      ⇐
                4      Xiēminī      ⇐
         PRES   3      Ximinī       Xāminī
                3a     Ximinī       Xiāminī
                4      Xīminī       Xiāminī
         PAST   3      Xēbāminī     Xerēminī
                3a     Xiēbāminī    Xerēminī
                4      Xiēbāminī    Xīrēminī

PARADIGMATIC STRATEGIES, CONJUGATIONS 3, 3A, 4: 3PL
                       −PERF                     +PERF
                       IND          SUBJ         IND         SUBJ
ACTIVE   FUT    3      Xent         ⇐            X′erint     ⇐ ⇓
                3a     Xient        ⇐            X′erint     ⇐ ⇓
                4      Xient        ⇐            X′erint     ⇐ ⇓
         PRES   3      Xunt         Xant         X′ērunt     X′erint
                3a     Xiunt        Xiant        X′ērunt     X′erint
                4      Xiunt        Xiant        X′ērunt     X′erint
         PAST   3      Xēbant       Xerent       X′erant     X′issent
                3a     Xiēbant      Xerent       X′erant     X′issent
                4      Xiēbant      Xīrent       X′erant     X′issent
PASSIVE  FUT    3      Xentur       ⇐            *
                3a     Xientur      ⇐
                4      Xientur      ⇐
         PRES   3      Xuntur       Xantur
                3a     Xiuntur      Xiantur
                4      Xiuntur      Xiantur
         PAST   3      Xēbantur     Xerentur
                3a     Xiēbantur    Xerentur
                4      Xiēbantur    Xīrentur
APPENDIX 3: CORNER STRATEGIES

CONJ 1         NONPERFECT ACTIVE        NONPERFECT PASSIVE        PERFECT
               SG         PL            SG          PL            SG         PL
FUT            Xābō       Xābimus       Xābor       Xābimur       X′erō      X′erimus
               Xābis*     Xābitis       Xāberis*    Xābiminī      X′eris*    X′eritis
               Xābit      Xābunt        Xābitur     Xābuntur      X′erit     X′erint
PRES IND       Xō         Xāmus         Xor         Xāmur         X′ī        X′imus
               Xās*       Xātis         Xāris*      Xāminī        X′istī*    X′istis
               Xat        Xant          Xātur       Xantur        X′it       X′ērunt
PRES SUBJ      Xem        Xēmus         Xer         Xēmur         X′erim     X′erīmus
               Xēs*       Xētis         Xēris*      Xēminī        X′erīs*    X′erītis
               Xet        Xent          Xētur       Xentur        X′erit     X′erint
PAST IND       Xābam      Xābāmus       Xābar       Xābāmur       X′eram     X′erāmus
               Xābās*     Xābātis       Xābāris*    Xābāminī      X′erās*    X′erātis
               Xābat      Xābant        Xābātur     Xābantur      X′erat     X′erant
PAST SUBJ      Xārem      Xārēmus       Xārer       Xārēmur       X′issem    X′issēmus
               Xārēs*     Xārētis       Xārēris*    Xārēminī      X′issēs*   X′issētis
               Xāret      Xārent        Xārētur     Xārentur      X′isset    X′issent

CONJ 2         NONPERFECT ACTIVE        NONPERFECT PASSIVE        PERFECT
               SG         PL            SG          PL            SG         PL
FUT            Xēbō       Xēbimus       Xēbor       Xēbimur       X′erō      X′erimus
               Xēbis*     Xēbitis       Xēberis*    Xēbiminī      X′eris*    X′eritis
               Xēbit      Xēbunt        Xēbitur     Xēbuntur      X′erit     X′erint
PRES IND       Xeō        Xēmus         Xeor        Xēmur         X′ī        X′imus
               Xēs*       Xētis         Xēris*      Xēminī        X′istī*    X′istis
               Xet        Xent          Xētur       Xentur        X′it       X′ērunt
PRES SUBJ      Xem        Xēmus         Xer         Xēmur         X′erim     X′erīmus
               Xēs*       Xētis         Xēris*      Xēminī        X′erīs*    X′erītis
               Xet        Xent          Xētur       Xentur        X′erit     X′erint
PAST IND       Xēbam      Xēbāmus       Xēbar       Xēbāmur       X′eram     X′erāmus
               Xēbās*     Xēbātis       Xēbāris*    Xēbāminī      X′erās*    X′erātis
               Xēbat      Xēbant        Xēbātur     Xēbantur      X′erat     X′erant
PAST SUBJ      Xērem      Xērēmus       Xērer       Xērēmur       X′issem    X′issēmus
               Xērēs*     Xērētis       Xērēris*    Xērēminī      X′issēs*   X′issētis
               Xēret      Xērent        Xērētur     Xērentur      X′isset    X′issent

CONJ. 3 & 428       NONPERFECT ACTIVE        NONPERFECT PASSIVE         PERFECT
                    SG         PL            SG           PL            SG         PL
FUT           3     Xam        Xēmus         Xar          Xēmur         X′erō      X′erimus
              4     Xiam       Xiēmus        Xiar         Xiēmur        X′erō      X′erimus
              3     Xēs*       Xētis         Xēris*       Xēminī        X′eris*    X′eritis
              4     Xiēs*      Xiētis        Xiēris*      Xiēminī       X′eris*    X′eritis
              3     Xet        Xent          Xētur        Xentur        X′erit     X′erint
              4     Xiet       Xient         Xiētur       Xientur       X′erit     X′erint
PRES IND      3     Xō         Ximus         Xor          Ximur         X′ī        X′imus
              4     Xiō        Xīmus         Xior         Xīmur         X′ī        X′imus
              3     Xis*       Xitis         Xeris*       Ximinī        X′istī*    X′istis
              4     Xīs*       Xītis         Xīris*       Xīminī        X′istī*    X′istis
              3     Xit        Xunt          Xitur        Xuntur        X′it       X′ērunt
              4     Xit        Xiunt         Xītur        Xiuntur       X′it       X′ērunt
PRES SUBJ     3     Xam        Xāmus         Xar          Xāmur         X′erim     X′erīmus
              4     Xiam       Xiāmus        Xiar         Xiāmur        X′erim     X′erīmus
              3     Xās*       Xātis         Xāris*       Xāminī        X′erīs*    X′erītis
              4     Xiās*      Xiātis        Xiāris*      Xiāminī       X′erīs*    X′erītis
              3     Xat        Xant          Xātur        Xantur        X′erit     X′erint
              4     Xiat       Xiant         Xiātur       Xiantur       X′erit     X′erint
PAST IND      3     Xēbam      Xēbāmus       Xēbar        Xēbāmur       X′eram     X′erāmus
              4     Xiēbam     Xiēbāmus      Xiēbar       Xiēbāmur      X′eram     X′erāmus
              3     Xēbās*     Xēbātis       Xēbāris*     Xēbāminī      X′erās*    X′erātis
              4     Xiēbās*    Xiēbātis      Xiēbāris*    Xiēbāminī     X′erās*    X′erātis
              3     Xēbat      Xēbant        Xēbātur      Xēbantur      X′erat     X′erant
              4     Xiēbat     Xiēbant       Xiēbātur     Xiēbantur     X′erat     X′erant
PAST SUBJ     3     Xerem      Xerēmus       Xerer        Xerēmur       X′issem    X′issēmus
              4     Xīrem      Xīrēmus       Xīrer        Xīrēmur       X′issem    X′issēmus
              3     Xerēs*     Xerētis       Xerēris*     Xerēminī      X′issēs*   X′issētis
              4     Xīrēs*     Xīrētis       Xīrēris*     Xīrēminī      X′issēs*   X′issētis
              3     Xeret      Xerent        Xerētur      Xerentur      X′isset    X′issent
              4     Xīret      Xīrent        Xīrētur      Xīrentur      X′isset    X′issent
Notes

1. I do not question the wisdom of this position as part of such a radical departure from the deeply ingrained Pāṇinian tradition. There is much baggage to be jettisoned if we are to give whole-word morphology a full and fair hearing.
2. ‘Lexeme’ or ‘lexical item’ may be substituted for ‘word’ in both roles.
3. Latin presents certain advantages for anyone wishing to explore in depth the morphology of a fusional or flectional language (Matthews 1991: 169ff). They include not only ease of access to extensive written records, but also a historical-comparative perspective that facilitates the study of morphological change (not to mention absence of many alphabet and vocabulary obstacles, especially for those literate in English).
4. I use privative features here for both categories.
5. Bender (2000: 26–28) provides an explanation as to how morphological classes are created (and why they prevail over the phonologically-based alternatives that have been offered). The fact that paradigms present themselves in such cases in a small number of unmixed sets is most clearly set forth in the work of Carstairs (1987), where it is labelled ‘The Paradigm Economy Principle’. Readers are encouraged to consult Spencer (1991: 227–29), who elaborates on the Hungarian example provided by Carstairs.
6. The example could be expanded by inclusion of the 2nd and 3rd conjugations, although less dramatically, because of syncretism between these two conjugations in the imperfect. It could also be expanded with equal effect by the other person and number combinations.
7. The finite inflections of a regular Latin verb number 90 in all, 30 each for the nonperfect active and passive, and 30 for the perfect active. (The perfect passive is accomplished periphrastically, rather than morphologically, by combinations of the participle and inflected forms of the verb sum ‘I am’.)
8. The term ‘poles’ is from Ford, Singh and Martohardjono (1997: 11).
9. X takes this value only if long vowels are treated as geminate (see Aronoff 1992). Otherwise, when vowels are treated as integral wholes (as implied by the macron notation), X has the same value as in (1), and the strategy reads as /Xat/ ↔ /Xātis/.
10. It will be noted that I view the Latin ‘perfect’ as an aspect, and the ‘imperfect’ as a tense that I label here as ‘past’. The perfect passive quadrant in (5–10) is vacant of forms, because this combination is accomplished periphrastically, not morphologically. The asterisk, while not centrally located within the overall array, still serves as a reminder that all 15 poles are to be viewed as radially interconnected. The arrows mark instances of syncretism.
11. They are, however, included in Appendix 3.
12. Appendix 3 gives all the corner strategies for all conjugations for purposes of reference.
13. The resemblances of the first and second conjugations to each other are also great. Their corner strategies in Appendix 3 differ only by the following four strategies: /Xā/CONJ1 ↔ /Xē/CONJ2, /Xa/CONJ1 ↔ /Xe/CONJ2, /Xō/CONJ1 ↔ /Xeō/CONJ2, /Xo/CONJ1 ↔ /Xeo/CONJ2.
14. It is probably not coincidental that, according to markedness theory, this is the least marked point of the entire 90-cell paradigm. Because of the inverse relation between markedness and frequency, these forms would have been among the most frequent, and thus also among the earliest learned and the least easily forgotten, and, what is of special relevance here, they were equally well members of either paradigm.
15. The 3a verbs tend to have light root syllables, while those of the 4th conjugation tend to have heavy root syllables or a succession of light ones (Mester 1994: 23–28). I wish to thank Philip Baldi for pointing me to this work and for helpful discussions on such matters, although it should be made clear that he bears no responsibility for my interpretations.
16. Or otherwise specially identified, as is the present active cell of the 3S table.
17. See Matthews’ (1991) Chapter 12 on iconicity, for an excellent discussion of this concept illustrated by both Italian and Latin.
18. Thus exemplifying the markedness notion of ‘deviance’, which says that greater irregularity is to be found within the unmarked category (Bender 1998a: 58), in this case, the singular.
19. Although strategies are bidirectional, I view this as the initial direction, because I see the mixture as originating in the active, with its ambiguous 3S ending, ambiguous between the 3rd and 4th conjugations.
20. Mention is made of the root here not as a formal entity but simply to emphasize the depth of the category of aspect in the iconic scheme of the Latin verb. We are faced with a problem as to how to represent formally a most deep-seated manipulation that is part of inflection.
21. If one is aware of the s/r alternation termed ‘rhotacism’, it is possible to see reduplication in the NONPERFECT, and incremental u/v in the PERFECT for this word (Hale and Buck 1903: 109).
22. From historical-comparative perspective, this is actually failure to include the nasal infix present in the nonperfect forms.
23. The initial hyphen here indicates verbs that do not occur without a prefixal element, as for example dēfendō ‘repel’.
24. Estimates of the numbers of verbs in each conjugation are from Greenough et al. (1983).
25. One could say that strategies are distillations of analogies, based on knowledge of particular wordforms, but of no word in particular; all that is required is that it be a member of the same paradigm, lest analogical changes take place.
26. Actually, it is necessary to know at least one of the perfect forms of even those verbs that do form their perfect stems regularly based on (13). This is how we identify them as such.
27. These strategies are, in fact, already contained within the strategies of Appendices 1 and 2, and can be extracted simply by excluding their passive and perfect sectors, and by ‘deconflating’ what remains of Appendix 2 into three separate strategies.
28. Shaded cells give the pattern for the mixed (3a) conjugation. Even though there is no difference between 3 and 4 in the PERFECT, 4 is shaded there also in order to accentuate the original pattern within the entire table, and thereby call greater attention to the mixing that has taken place in the PRESENT INDICATIVE and the PAST (IMPERFECT) SUBJUNCTIVE.
29. Those verbs that now inflect according to conjugation 3a are thought to have earlier followed conjugation 4; mixture occurred at those places where they changed to a form identical with that of conjugation 3.
30. I use here the ‘]’ notation now being employed by ‘seamless’ morphologists such as Starosta and Singh (in this volume), standing for the right edge of the word.
31. Except for the i of the FUT. bi and of the 3RD CONJ. 3S it], which do not lengthen.
References
Aronoff, Mark. 1992. ‘Stems in Latin verbal morphology’. In Morphology Now, ed. by Mark Aronoff, pp. 5–32. Albany: State University of New York Press.
Aronoff, Mark. 1994. ‘Stems in Latin verbal morphology’. In Morphology by Itself: Stems and Inflectional Classes, ed. by Mark Aronoff, pp. 31–59. Linguistic Inquiry Monograph No. 22. Cambridge, Mass.: MIT Press.
Bender, Byron W. 1998a. ‘Markedness and iconicity: Some questions’. In Case, Typology, and Grammar, ed. by Anna Siewierska and Jae Jung Song, pp. 57–70. Typological Studies in Language 38. Amsterdam: John Benjamins.
Bender, Byron W. 1998b. ‘The sign gravitates to the word’. In Productivity and Creativity: Studies in General and Descriptive Linguistics in Honor of E.M. Uhlenbeck, ed. by Mark Janse with the assistance of An Verlinden, pp. 15–25. Berlin: Mouton de Gruyter.
Bender, Byron W. 2000. ‘Paradigms as rules’. In Grammatical Analysis: Morphology, Syntax, and Semantics: Studies in Honor of Stanley Starosta, ed. by Videa P. De Guzman and Byron W. Bender, pp. 14–28. Oceanic Linguistics Special Publication No. 29. Honolulu: University of Hawai‘i Press.
Carstairs, Andrew. 1987. Allomorphy in Inflexion. Beckenham: Croom Helm.
Ford, Alan, Rajendra Singh and Gita Martohardjono. 1997. Pace Pāṇini: Toward a Word-based Morphology. American University Studies 13, Linguistics, Vol. 34. New York: Lang.
Greenough, J.B., A.A. Howard, G.L. Kittredge, and Benj. L. D’Ooge, eds. 1983 [1903]. Allen and Greenough’s New Latin Grammar. New Rochelle, N.Y.: Aristide D. Daratzas.
Hale, William Gardner and Carl Darling Buck. 1903. A Latin Grammar. New York: Mentzer, Bush and Company.
Matthews, P.H. 1974. Morphology: An Introduction to the Theory of Word-Structure. Cambridge: Cambridge University Press.
Matthews, P.H. 1991. Morphology. Second Edition. Cambridge: Cambridge University Press.
Mester, R. Armin. 1994. ‘The quantitative trochee in Latin’. Natural Language and Linguistic Theory 12: 1–61.
13
Morphology in Minimal Information Grammar
Danko Šipka

Introduction
In this article, I will present several solutions adopted in the morphological generator within the machine translation system called NeuroTran®. While morphology within this system is not seamless, it still has much in common with this idea. The system uses morphological boundaries if they can contribute to its efficiency, and it changes them if they are a hurdle. I will first present the machine translation system and its grammar rules, then the treatment of morphology within these grammar rules, and finally the minimization strategies, which at the same time show the role of morphological boundaries in the system.

NeuroTran® and MIG
NeuroTran® is a software programme developed by Translation Experts Ltd. (More about the company is available at http://www.tranexp.com.) It is intended to ‘do things with words’. It is typically post-Fordist and utilizes its knowledge base in various ways depending upon the specific options selected. It includes a morphological generator and analyzer, a dictionary with a lexical list extraction capability, sentence parsing and translation capabilities, as well as quantitative and qualitative analysis tools. In its sentence translation mode, the software acts as a bilingual transfer system. Since certain minimization strategies rely on transferring features and equivalents from one language to another, one could say that the project as a whole has some properties of a multilingual system.¹
Crucial solutions within NeuroTran® are rooted in certain human cognitive faculties. When faced with the high complexity of their environment, people often resort to heuristics and schemata in an effort to reduce the resources required to process such data cognitively (Fiske and Taylor 1991; Kahneman, Slovic and Tversky 1982). We must first recognize the high complexity of the tasks NeuroTran® is intended to perform and then attempt to make it operate in a manner similar to the way in which human beings actually process information. Furthermore, we must at least attempt to utilize the properties of human cognitive processing in preparing the knowledge base for the programme.

NeuroTran® is based upon three crucial ideas. The first is that one needs to minimize, to the greatest extent possible, all the information required by the software to function and then allow it to acquire new information by reading texts and communicating directly with the user. The programme accomplishes this by using artificial neural networks; that is to say, it starts as a ‘cognitive miser’ and then uses schemata to assimilate new pieces of information while at the same time adapting and changing old ones according to the new information. The second main idea behind NeuroTran® is that one needs to reduce the effort required of the lexicographer, again by requiring minimum information or input. In other words, the dictionary and programme creators function as ‘cognitive misers’ on behalf of the user. Finally, all NeuroTran® data need to be reusable, so that all knowledge bases and functions are usable in various situations and for various fields of endeavour as well as for different languages.

NeuroTran® uses a set of formal rules we have named Minimal Information Grammar (MIG). These rules operate on the basis of a bilingual labelled lexical list (the list of equivalents with their respective grammar and usage labels together with frequency data, etc.) and a representative corpus for both languages in any given pairing. The architecture of MIG is subordinate to the fundamental ideas behind NeuroTran®. The grammar was named minimal because it reduces the information required for the software to perform its functions at a high level of speed and accuracy. It does this by: (a) balancing information in both the rules and the data it operates under; (b) using different classes of rules (constructors, mutators, selectors, etc.) to
manipulate existing linguistic material; and (c) using artificial neural networks to provide new information as a direct result of the learning process which occurs any time the software ‘reads’ a text or communicates with a user. Existing information is virtually and continuously ‘recycled’.

MIG operates with the following classes of rules:
1. constructors: use dictionary labels to construct all possible forms of a word
2. mutators: change already generated forms
3. finders: find the form or word required
4. definers: decide what is what
5. coordinators: determine how one form coincides with others
6. choppers: divide larger units into smaller ones
7. binders: unite smaller units into larger ones
8. transformers: replace one word or form with another, for example, by translating a word in one language into a word in the other
9. counters: keep track of all statistics
10. doubters: detect situations where there are more possibilities than would allow the programme to proceed
11. gamblers: choose the solution that (based upon everything in the database) is the most probable even though other options remain viable
12. teachers: change existing information (rules and figures) after reading different texts and translations
13. chatters: ask the user when they need a piece of information or when the user wants to change something
14. conductors: direct the order in which the rules are to be applied.

Every rule consists of a head (stating the input of the rule) and a body (providing details of how the output is calculated). This is illustrated in Table 13.1, using the example of the English to Serbo-Croatian transfer rule for number-gender coordination between the nominal head and its adjective modifier.

Table 13.1
Rule: ENGSCR GRM N[ADJECTIVE|PRONOUN] NOUN => COPY(2>1:NUMBER, GENDER)
Example: big book -> velik knjiga; velik knjiga -> velika knjiga

The entry in the labelled list of equivalents has the following structure:
<entry><usage data><labels>; <equivalent 1><usage labels>; <equivalent 2><usage labels>; ...; <equivalent n><usage labels>

The text corpus data are attached to the programme as text, with an index pointing from each form in the dictionary to the starting and ending byte of each of its occurrences in the text.
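
To make the head-and-body format of Table 13.1 more concrete, the following Python fragment is a minimal sketch of how the transfer step and the COPY(2>1:NUMBER, GENDER) coordination might interact. It is not NeuroTran® code: the function names, the tiny lexicon and its feature values are assumptions introduced purely for illustration.

```python
# Illustrative sketch only; none of these names come from NeuroTran.
# Hypothetical transfer lexicon (English -> Serbo-Croatian citation forms).
EQUIVALENTS = {"big": "velik", "book": "knjiga", "village": "selo"}

# Hypothetical noun features and adjective forms (nominative singular only).
NOUN_FEATURES = {
    "knjiga": {"NUMBER": "SG", "GENDER": "F"},
    "selo": {"NUMBER": "SG", "GENDER": "N"},
}
ADJECTIVE_FORMS = {
    "velik": {("SG", "M"): "velik", ("SG", "F"): "velika", ("SG", "N"): "veliko"},
}


def transfer(phrase):
    """Word-for-word transfer: 'big book' -> ['velik', 'knjiga']."""
    return [EQUIVALENTS[word] for word in phrase.split()]


def copy_number_gender(words):
    """COPY(2>1:NUMBER, GENDER): word 1 takes the number and gender of word 2."""
    adjective, noun = words
    features = NOUN_FEATURES[noun]
    key = (features["NUMBER"], features["GENDER"])
    return [ADJECTIVE_FORMS[adjective][key], noun]


def translate(phrase):
    """Transfer first, then coordinate, as in the Example of Table 13.1."""
    return " ".join(copy_number_gender(transfer(phrase)))


print(translate("big book"))     # velika knjiga
print(translate("big village"))  # veliko selo
```

The point of the sketch is the division of labour: the transfer step knows nothing about agreement, and the coordination step knows nothing about English.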

MIG Morphology
Morphological paradigms within MIG are the basic information used by the neural network to initiate and investigate all possible solutions in translation, parsing, and qualitative analysis. In order to generate a paradigm, MIG uses labels attached to the lexical entries in the main dictionary text, and then applies a set of primarily constructor-type rules to the word bearing the label. It is well known that a morphological generator and an analyzer are in fact two sides of the same coin. Systems seeking minimality can thus develop one and derive the other from it. The idea of MIG is to develop the generator and to derive the analyzer from it. The following dictionary entry:
selo,a n; [. . .]/village n; [. . .]
is used to generate both its forms and the analyzer’s entries:

Table 13.2: Derivation Summary
Rule | Explanation | Generated forms | Generated analyzer’s entries
SCR PARA *o, a n => | Head of the rule. If you find an entry ending in *o, a n; do the following (=>) | |
NOUN; NEUTER; | Declare it a noun in neuter gender | |
O1=(1->','-1); | Its stem is the part until the comma minus one character | sel |
SINGULAR; | Its singular forms are as follows: | |
NOM=O1+o; | Nominative is the stem plus ‘o’ | selo | selo, NSN [. . .]/village n; [. . .]
GEN=O1+a; | Genitive is the stem plus ‘a’ | sela | sela, GSN [. . .]/village n; [. . .]
DAT=O1+u; | . . . | selu | selu, DSN [. . .]/village n; [. . .]
[. . .] | | |
PLURAL; | | |
NOM=O1+a; | | sela | sela, NPN [. . .]/village n; [. . .]
[. . .] | | [. . .] |
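
The derivation summarized in Table 13.2 can be imitated in a few lines of Python. This is only a sketch under my own assumptions: the helper names are invented, the equivalents are passed in as a plain string, and only the four cells spelled out in the table are included (the elided cells are not filled in). It shows how the analyzer's entries fall out of the generator rather than being written separately.

```python
# Illustrative sketch only; the helper names are assumptions.
# Cells spelled out in Table 13.2; the cells elided there ([. . .]) are left out.
NEUTER_O_CELLS = [("NSN", "o"), ("GSN", "a"), ("DSN", "u"), ("NPN", "a")]


def stem(entry, drop=1):
    """O1=(1->','-1): the text before the first comma, minus `drop` characters."""
    head = entry.split(",", 1)[0]
    return head[:len(head) - drop]


def generate(entry):
    """Generator: every (form, tag) pair licensed by the rule."""
    base = stem(entry)
    return [(base + ending, tag) for tag, ending in NEUTER_O_CELLS]


def analyzer_entries(entry, equivalents):
    """Analyzer: the same pairs, indexed by form instead of by cell."""
    index = {}
    for form, tag in generate(entry):
        index.setdefault(form, []).append(f"{form}, {tag} / {equivalents}")
    return index


entries = analyzer_entries("selo,a n;", "village n;")
for form in sorted(entries):
    for line in entries[form]:
        print(line)
```

Note that the form sela is indexed twice, once as genitive singular and once as nominative plural; this is the kind of ambiguity that later rule classes have to resolve.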

The morphological rules clearly allow the stem to be defined arbitrarily at any point, as can be seen from the following Serbo-Croatian rule:
SCR PARA *ati,*em,* iv; =>
VERB; TEMPLATE=INFINITIVE,1ST SG PRESENT,3RD PL PRESENT_iv(IMPERFECTIVE); EXAMPLE=orati,rem,ru iv;
ACTIVE; O1=(1->','); INFINITIVE=O1;
PRESENT; AFFIRMATIVE;
O1=(1->SAMEAS(1','+1))+(1','->2','-1); O2=(1->SAMEAS(2','+1))+(2','->1' ');
SINGULAR; FIRST=O1+m; SECOND=O1+š; THIRD=O1;
PLURAL; FIRST=O1+mo; SECOND=O1+te; THIRD=O2;
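
One possible reading of the stem definitions in this rule is sketched below in Python. The interpretation is mine and should be treated as an assumption: SAMEAS is taken to locate, in the infinitive, the first character identical to the first character of the label, everything before that character is kept, and the label (minus its final character for O1) is appended. I further assume that nothing is kept when no such character is found. The function names are hypothetical.

```python
# Illustrative sketch; this is one possible reading of the rule notation.
def parse(entry):
    """Split 'orati,rem,ru iv;' into the infinitive and the two label strings."""
    body, _aspect = entry.rstrip(";").rsplit(" ", 1)
    infinitive, first_sg, third_pl = (part.strip() for part in body.split(","))
    return infinitive, first_sg, third_pl


def overlap(infinitive, label):
    """(1->SAMEAS(x)) + label: keep the infinitive up to the first character
    equal to the first character of the label, then append the label.
    Assumption: if no such character exists, nothing is kept."""
    cut = infinitive.find(label[0])
    return (infinitive[:cut] if cut >= 0 else "") + label


def present(entry):
    """Affirmative present forms built from the stems O1 and O2."""
    infinitive, first_sg, third_pl = parse(entry)
    o1 = overlap(infinitive, first_sg[:-1])  # label minus its final character
    o2 = overlap(infinitive, third_pl)
    return {"1SG": o1 + "m", "2SG": o1 + "š", "3SG": o1,
            "1PL": o1 + "mo", "2PL": o1 + "te", "3PL": o2}


print(present("orati,rem,ru iv;"))
print(present("slati,šaljem,šalju iv;"))
```

Under this reading a short label such as rem suffices for orati because the infinitive and the present stem share the material before r, whereas slati needs the whole present form in its label because nothing can be reused.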

Minimizing Strategies within MIG Morphology
Every rule used in the generator consists of a head (stating the input of the rule) and a body (providing details of how the output is calculated), just as in any other MIG rule. Labels in the knowledge base consist of the endings of the words, starting from the point which allows the paradigm to be generated with the shortest possible corresponding rule. For example, in the Serbo-Croatian knowledge base the verb slati,šaljem,šalju iv; (‘send’) has a much longer label than orati,rem,ru iv; (‘plough’), although both fall under the *ati,*em,* iv; generator rule head. The morphological generator thus does not follow morphological segmentation, but simply looks for the simplest solution. Furthermore, the system has a specific treatment of alternations, using the same minimal-effort approach. Finally, the system has the option of semi-automatic tagging, which is applied to the lexeme as a whole. I will discuss these three strategies in turn.

The fact that morphological segmentation is irrelevant can be exemplified using Serbo-Croatian entries like pile ‘chick’, tele ‘calf’, etc., which have the following status in traditional morphology:

Case | Segmentation | Explanation
Nominative Singular | pil-e | short stem
Genitive Singular | pilet-a | long stem
Nominative Plural | pilad-0/pilić-i | suppletive stems
Genitive Plural | pilad-i/pilić-a | suppletive stems

MIG, however, follows the simplest solution, and thus defines the stem for this noun paradigm only once, using the following rule:
SCR PARA *e,eta n =>
NOUN; TEMPLATE=NOM SG, GEN SG_N(EUTER); EXAMPLE=pile,eta n; NEUTER;
O1=(1->','-1);
SINGULAR; NOM=O1+e; GEN=O1+eta; DAT=O1+etu; ACC=O1+e; VOC=O1+e; INS=O1+etom; LOC=O1+etu;
PLURAL; NOM=O1+ad||O1+ići; GEN=O1+adi||O1+ića; DAT=O1+adima||O1+ićima; ACC=O1+ad||O1+iće; VOC=O1+adi||O1+ići; INS=O1+adima||O1+ićima; LOC=O1+adima||O1+ićima

The segmentation in MIG exhibits the following differences in relation to the traditional one:

Traditional | MIG
pil-e | pil-e
pilet-a | pil-eta
pilad-0/pilić-i | pil-ad/pil-ići
pilad-i/pilić-a | pil-adi/pil-ića
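
The single-stem treatment of pile can be checked mechanically. The Python fragment below is a toy reader for rule bodies of the shape CELL=O1+x||O1+y; it is a sketch under my own assumptions (the function name and the simplification that every alternative begins with O1+ are mine), not a description of how the generator is actually implemented.

```python
# Illustrative sketch: a toy reader for rule bodies of the form CELL=O1+x||O1+y;
SINGULAR = "NOM=O1+e; GEN=O1+eta; DAT=O1+etu; ACC=O1+e; VOC=O1+e; INS=O1+etom; LOC=O1+etu;"
PLURAL = ("NOM=O1+ad||O1+ići; GEN=O1+adi||O1+ića; DAT=O1+adima||O1+ićima; "
          "ACC=O1+ad||O1+iće; VOC=O1+adi||O1+ići; INS=O1+adima||O1+ićima; "
          "LOC=O1+adima||O1+ićima")


def expand(body, stem):
    """Map each case cell to the list of its alternative forms."""
    cells = {}
    for assignment in body.split(";"):
        assignment = assignment.strip()
        if not assignment:
            continue
        cell, expression = assignment.split("=", 1)
        # '||' separates alternatives; each alternative is the stem plus an ending.
        cells[cell] = [alt.strip().replace("O1+", stem, 1)
                       for alt in expression.split("||")]
    return cells


stem = "pile,eta n;".split(",", 1)[0][:-1]  # O1=(1->','-1) gives 'pil'
print(expand(SINGULAR, stem)["GEN"])
print(expand(PLURAL, stem)["NOM"])
```

One stem definition serves all fourteen cells, including both plural variants.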

This example clearly demonstrates that adopting morphological boundaries different from the traditional ones leads to a shorter description, which in turn minimizes the resources needed to create the generator.

Another minimization strategy is the combination of the constructor (presented in Table 13.3) and the three mutators (presented in Table 13.4) to generate the inflection for a whole range of Polish feminine nouns while at the same time accounting for a broad range of both stem and ending alternations. This combination of rules is sufficient for such diverse examples as teczka, GPl teczek (‘portfolio’); noga, GPl nóg (‘leg’); kobieta, GSg kobiety (‘woman’); and apteka, GPl aptek (‘pharmacy’). The labels contained in dictionary entries require only basic information which cannot be inferred from the form of the lexeme. All other information is inferred from the form of the entry and dealt with by coordinating constructors and mutators. The basic operating principle is that if the conditions are met for a mutator to be applied, then it changes the stem. If no such conditions are present, nothing happens.

Table 13.3: The Constructors
Constructor rule | Explanation
POL PARA *a, V1, V2 f => | Head: if an entry ending like this is discovered
NOUN; FEMININE; O1=(1->','-1); | Body: it is a feminine noun and its stem is the part preceding the comma with the final character deleted
SINGULAR; NOM=O1+a; | The Nominative Singular is constructed by adding ‘a’ to the stem
GEN=O1+V1; | In the Genitive Singular, the vowel after the first comma is added to the stem
DAT=PAL(O1)+e; | In the Dative Singular, the mutator called PAL is used
ACC=O1+e; INS=O1+a; |
LOC=PAL(O1)+e; | In the Locative Singular, the mutator called PAL is used
VOC=O1+o; |
PLURAL; NOM=O1+V2; | In the Nominative Plural, the vowel following the second comma is added to the stem
GEN=OU(KEK(O1)); | In the Genitive Plural, both mutators, OU and KEK, are applied to the stem
DAT=O1+om; ACC=O1+V2; INS=O1+ami; LOC=O1+ach; VOC=O1+V2 |

Table 13.4: The Mutators
Mutator rule | Explanation
POL FUN PAL => PAL[O]=LAST[O][(t, d, r, sz, z, rz, k, g, ch, l, p, b, w, m, n, s, z, c)=>(ci, dzi, rz, si, zi, zi, c, dz, sz, l, pi, bi, wi, mi, ni, si, zi, ci)] | Function PAL. If the last character of the stem is one of the characters before the => sign, then the function changes it into the corresponding one after that sign. Otherwise, nothing happens
POL FUN OU => OU[O]=[O][(*KoK_)=>(*KóK_)] | Function OU. If the stem ends in the sequence consonant-‘o’-consonant, then this ‘o’ is changed into ‘ó’
POL FUN KEK => KEK[O]=[O][(*KK_)=>(*KeK_)] | Function KEK. If the stem ends in a sequence of two consonants, then ‘e’ is inserted between these two consonants

If we look at this rule applied
to the Genitive Plural, we can see that (in the case of the entry teczka,i,i f;) the constructor generates the stem teczk. In the Genitive Plural, there are two consonants at the end of the stem, and the mutator KEK inserts ‘e’ between them, thereby providing teczek. The entry noga,i,i f; does not fulfil this criterion, so there is no similar ‘e’ insertion. But the conditions to change ‘o’ into ‘ó’ are present, so the mutator OU is applied and the final Genitive Plural form becomes nóg. Finally, the entry apteka,i,i f; does not fulfil either of the criteria, so the stem remains unchanged and the Genitive Plural becomes aptek. See Table 13.5.

Table 13.5: Sample Derivations
Rule | Examples
POL PARA *a,V1,V2 f => NOUN; FEMININE; | kobieta | apteka | teczka | noga | koza
O1=(1->','-1); | kobiet | aptek | teczk | nog | koz
SINGULAR; NOM=O1+a; GEN=O1+V1; |
DAT=PAL(O1)+e; | kobieci+e | aptec+e | teczc+e | nodz+e | kozi+e
ACC=O1+e; INS=O1+a; |
LOC=PAL(O1)+e; | kobieci+e | aptec+e | teczc+e | nodz+e | kozi+e
VOC=O1+o; PLURAL; NOM=O1+V2; |
GEN=OU(KEK(O1)); | kobiet | aptek | teczek | nóg | kóz
DAT=O1+om; ACC=O1+V2; INS=O1+ami; LOC=O1+ach; VOC=O1+V2 |

Finally, the third minimization strategy used by the system consists of using one part of the dictionary entries to tag the others semi-automatically. The idea is to look for any number of characters at the end of an entry to find the shortest string which allows unambiguous morphological tag assignment. It is important to stress that longer strings are applied prior to their shorter counterparts, as we can see from the following Serbo-Croatian example:
SCR LABEL-ORDER => [. . .] šov/,a m; lov/,a m; ov/,a,o-; tiv/,a m; hiv/,a m; iv/,a,o-; [. . .]
and the Polish example:
POL LABEL-ORDER => [. . .] edzia/,iego,iowie ## m; zia/,zi,zie f; baja/,i fsg; aja/,i,e f; eacja/,i,e f; dacja/,i,e f; ykcja/,i,e f; tycja/,i,e f; cja/,i fsg; [. . .]
As can be seen, this is applied to the lexeme as a whole, without taking into account any morphological boundaries.
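
The ordering principle can be illustrated with a short Python sketch. It assumes, as my own simplification, that each item in a LABEL-ORDER list pairs an ending (before the slash) with the label to be attached (after it), that the label is an opaque string, and that the first listed ending that matches wins. The function names and the words being tagged are my own illustrations, not taken from the system.

```python
# Illustrative sketch; the function names and sample words are assumptions.
SCR_LABEL_ORDER = "šov/,a m; lov/,a m; ov/,a,o-; tiv/,a m; hiv/,a m; iv/,a,o-"


def parse_order(spec):
    """Turn 'ending/label; ending/label' into an ordered list of pairs."""
    pairs = []
    for item in spec.split(";"):
        item = item.strip()
        if item:
            ending, label = item.split("/", 1)
            pairs.append((ending, label))
    return pairs


def assign_label(word, pairs):
    """Return the label of the first listed ending that matches, else None."""
    for ending, label in pairs:
        if word.endswith(ending):
            return label
    return None


def shadowed(pairs):
    """Endings that can never fire because a shorter ending precedes them."""
    return [later for i, (later, _) in enumerate(pairs)
            if any(later.endswith(earlier) for earlier, _ in pairs[:i])]


pairs = parse_order(SCR_LABEL_ORDER)
for word in ["ulov", "bratov", "motiv", "plašljiv"]:
    print(word + assign_label(word, pairs))
```

The helper shadowed makes the reason for the ordering explicit: if ov were listed before lov, the longer ending could never apply, and every word in -lov would be tagged as an adjective.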
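
Returning to the Polish feminine nouns of Tables 13.3 to 13.5, the interplay of the constructor with the mutators KEK, OU and PAL can also be sketched in Python. The sketch rests on several assumptions of mine: the vowel inventory used to decide what counts as a consonant, the function names, and a PAL table reduced to the four pairs needed for the nouns of Table 13.5 (a fuller table would have to test digraphs such as sz, rz and ch before single letters).

```python
# Illustrative sketch only; the vowel set and the function names are assumptions.
VOWELS = set("aeiouyąęó")


def is_consonant(ch):
    return ch.isalpha() and ch not in VOWELS


def kek(stem):
    """KEK: insert 'e' between two stem-final consonants; otherwise do nothing."""
    if len(stem) >= 2 and is_consonant(stem[-1]) and is_consonant(stem[-2]):
        return stem[:-1] + "e" + stem[-1]
    return stem


def ou(stem):
    """OU: change 'o' to 'ó' in a stem-final consonant-o-consonant sequence."""
    if (len(stem) >= 3 and is_consonant(stem[-1])
            and stem[-2] == "o" and is_consonant(stem[-3])):
        return stem[:-2] + "ó" + stem[-1]
    return stem


# Subset of the PAL table: only the pairs needed for the nouns of Table 13.5.
PAL_PAIRS = [("t", "ci"), ("k", "c"), ("g", "dz"), ("z", "zi")]


def pal(stem):
    """PAL: replace the stem-final consonant by its listed counterpart."""
    for plain, soft in PAL_PAIRS:
        if stem.endswith(plain):
            return stem[:len(stem) - len(plain)] + soft
    return stem


def stem_of(entry):
    """O1=(1->','-1): the text before the first comma, minus one character."""
    return entry.split(",", 1)[0][:-1]


def genitive_plural(entry):
    return ou(kek(stem_of(entry)))  # GEN=OU(KEK(O1))


def dative_singular(entry):
    return pal(stem_of(entry)) + "e"  # DAT=PAL(O1)+e


for entry in ["kobieta,y,y f;", "apteka,i,i f;", "teczka,i,i f;", "noga,i,i f;"]:
    print(entry, genitive_plural(entry), dative_singular(entry))
```

Each mutator checks its own condition and otherwise returns the stem unchanged, which is why the constructor can apply OU and KEK to every stem without knowing in advance which nouns alternate.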

Conclusion
It has been demonstrated that in this system traditional morphological boundaries are used when they are needed and re-created when they are found not to be productive. This was the only possible way to achieve the very practical goal of minimizing the resources needed to create the system. At the same time, the goal of any scholarly research should be to find the shortest possible and the most accurate account of any phenomenon, which in turn means that morphological description should be short and accurate, be it seamless or not.

Note
1. I present here only the linguistic side and the general layout of the project, leaving aside matters of programming, in which I am not involved. The programming of NeuroTran® and its tool called Dictman is carried out by the Translation Experts programmers Dr Nenad Končar, Slawomir Pawlowski, and Vladimir Šipka.

References
Fiske, S.T. and S.E. Taylor. 1991. Social Cognition. New York: McGraw-Hill.
Kahneman, D., P. Slovic and A. Tversky. 1982. Judgment under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.

About the Editors and Contributors

The Editors
Rajendra Singh is Professor of Linguistics at the Université de Montréal. His research interests include phonology, morphology, sociolinguistics, language-contact and Modern Hindi. Some of his recent books are Lectures Against Sociolinguistics, Pace Pāṇini: Towards a Word-based Theory of Morphology (co-authored with A. Ford and G. Martohardjono) and After Etymology: Towards a Substantivist Linguistics (co-authored with A. Ford and P. Dasgupta). He has also edited Towards a Critical Sociolinguistics and (with P. Dasgupta and J.K. Lele) Explorations in Indian Sociolinguistics.
Stanley Starosta was, until his untimely death last year, Professor of Linguistics at the University of Hawai‘i. He was engaged in research and teaching in Asia and Europe. His areas of research included grammatical theory, natural language processing, and languages of the Pacific and South, Southeast and East Asia. His best known work is The Case for Lexicase (1988).

The Contributors
Thomas Becker is Professor of Linguistics, University of Rostock, Germany.
Byron W. Bender is Professor Emeritus of Linguistics at the University of Hawai‘i, Honolulu.
Probal Dasgupta is Professor of Linguistics at the University of Hyderabad.
Alan Ford is Professor of Linguistics at the Université de Montréal.
Koenraad Kuiper is Professor of Linguistics at the University of Canterbury, Christchurch, New Zealand.
Siew Ai Ng is Professor of Chinese Studies at the National University of Singapore.
Franz Rainer is Professor of Romance Languages at the Business University, Vienna.
Robert Ratcliffe is Associate Professor of Arabic at the Tokyo University of Foreign Studies.
Danko Šipka is Associate Professor of Linguistics at the University of Poznan and Senior Linguist at M.R.M., USA.
Zhiqian Wu is Professor of English at East China Normal University, Shanghai.
Index ! 341
Index abstraction, 84 accent placement, theory, 59 accountability, 293 Accusative, 161 Accusative NP, 157 activeñpassive strategies, 69, 309, 310, 317 ëadjacency conditioní, 273 Adjective-Noun compounds, 131 adjectives, 240, 286, 288, 298 adverb, 169 affixes, affixation, affixation strategies, 23, 31, 32, 34, 36, 38, 54, 61, 62, 63n, 68, 74, 78, 94, 103, 119, 121, 171, 125, 126, 180, 198, 206, 213ñ14, 215ñ16, 219, 223, 225, 226, 227, 230, 231, 235, 236ñ37, 240, 242, 244ñ46, 248ñ49, 252, 257, 259, 260, 263n, 265n, 273, 274ñ78, 281, 296, 315; versus compounding, 68; derivational, 127, 181, 271; moraic, 235; substitution, 201; truncation, 20; verbal, 182; agent noun, 70, 158, 239 ëAggluinatingí, 117, 121 Akinlabi, Akinbiyi, 213, 226, 240 allomorphy, allomorphs, 24, 139ñ40, 219, 222, 244, 246, 247, 260, 275, 276 Americanism, 201 Amharic: default consonant, 255, 257 Amorphous Analogy alternative, 126ñ28 analogy, 197ñ98, 264n, 271, 274, 277, 293; non-grammatical, 242ñ43 Anderson, H., 29ñ30, 31 Anderson, Stephen, 67, 86ñ87, 118, 119, 122, 128, 131, 135, 138ñ39, 140, 141, 142, 143, 153, 220
anga, 72 aphasia, 37 aphorism, 45 apophony, 213, 216, 223, 224, 230, 244, 245, 260 Archangeli, Diana, 213, 218, 232, 240, 249, 256 Aristotle, 44, 58 Aronoff, M., 22, 24, 34, 216 Aspect model, 182 A¶¢adhyåyi, 66, 73, 74n atomism, 16 Augmentative, 78 Austronesian languages, 188 autonomous rule-component (ARC), 35, 37ñ38 Autosegmental Morphology Hypothesis, 212, 214ñ15, 216, 218, 221, 244ñ45, 249, 250, 260ñ61 back-formations, 21, 274, 277, 280 Bahasa Indonesia, 188 ëbahuvrihií, 124 Baker, Mark C., 53, 125, 149, 151 Bangla: compounds, 70, 77, 80ñ82, 84ñ85 base forms, 241; and derivatives, consistency, 200, 231; and the innovative, relation of contiguity, 206 Bat-el, Outi, 218, 221 Bauer, Laurie, 277 Bender, Byron W., 301, 302, 304, 305, 306 Berber: default consonant, 257 Bergsland, Knut, 52 Bhartrihari, 11, 66, 71, 78 bimoraic sequence, 233, 235, 242, 245
342 ! Explorations in Seamless Morphology ëblockingí, 34, 273 Bloomfield, L., 30, 66, 92, 101, 154, 155 Bogota, 201 Bohm, David, 44 Botha, Rudolf, 206 Buckley, Eugene, 255 Bybee, Joan L., 32, 43, 205 canonical nouns, 237, 238 canonical nouns stems, 237ñ38 canonicity, 239ñ40 Caplan, D., 36ñ37 Cardinal Geographic Direction (CGD), 287ñ89, 298 Carstairs, Andrew, 301, 303 caseñrelations, 158 causativization, 70 Chang, Hsun-huei, 108, 109 change consistency vs output consistency, 229ñ30 change invariant morphology, 247 Chao, Yuen Ren, 93ñ94, 97, 101, 103 Chilean: derivatives, 200 Chinese: Classical, 100; compounds, 180; polysyllabic words, 144n; word formation, 90ñ91, 93, 100, 103, 107; Word Formation Rules, 125 Choctaw: y-grade formations, 256 Chomsky, Noam A., 11, 25, 30, 118, 123ñ24, 125, 127, 140, 148, 150, 154, 182 coda, 55, 56, 214, 222, 228, 229, 236, 246, 250ñ51, 255, 256, 309 coda filling, 261 Columbia, 203 combining forms, 172ñ73 communication extraction capacity, 61 comparative/superlative forms, 240 compensatory lengthening, 261 complement, 54, 104 complement-case relations, 110, 184 complex nuclei, 228 component, 34 composites, 131ñ35, 293, 296, 298 compositionality, 272 compounds, compound formation, compounding process, 29, 58, 71,
72, 118, 149, 151ñ54, 155, 180, 182, 251, 271, 272ñ73; analogical, 140ñ41; endocentric, 281; as exceptions to pure A-morphous morphology, 121ñ23; containing bound morphemes, 135ñ40; derivational rule, 127; nonconcatenative, 274ñ77; phonological structure, 15, 141ñ42; problems, 124ñ26, 155; versus phrases, 131; verbs, 70 Comrie, Bernard, 181 concatenation, concatenative, 68, 251 concomitant, consequence, 69, 313 conjugations, 21, 28, 72, 262n, 302, 303, 304ñ7, 309, 313ñ14, 316ñ 17; intervening, 313 consistency, 230, 259ñ60 consonant, consonantal, 54, 55, 56, 78ñ79, 170, 214, 236ñ37, 238ñ 239, 241ñ47, 250, 251, 252, 253, 254ñ56, 257ñ59, 261, 312; deletion, 259; derivational, 177; root, 213, 216ñ20 constructor, 335, 336 ëconsumption ruleí, 169ñ70 contextual specifications, 299ñ300 contextual variants, 204, 206 ëContinuity Constraintí, 144n convergence, 16 core derivational morphology, 233ñ 36 corner strategies, 309 corpuscles, 44 correspondent, 158 Coseriu, Eugenio, 205 Cranberries, 99, 137ñ39 Dasgupta, Probal, 154 dative plural marker, 28 declensional classes, 21, 72 default directionality parameter, 214, 250ñ51, 261 defective and overlong, 241ñ43 deleting, 249ñ59 deliberate terminological obfuscation, 144n
Index ! 343 Dell, Francois, 218 derivation, derivational, derivatives, 21, 22, 31, 47, 68, 72, 73, 110, 140, 169, 170, 175, 180, 200, 205, 206, 213, 218, 222, 225, 226, 234, 237, 241, 243, 244ñ45, 247, 248, 249, 251, 252, 255, 259, 260, 264n, 271, 273, 286, 300; and composition, 22; mapping, 292, 294 Derivational Rules (DR), 107ñ8, 156, 275 dhatu, 72 Di Sciullo, A.M., 154 diachronic demotion, 85 diachronic derivations, 70 diachronic fragmentation process, 206 diachronic linguistics, 59, 61, 293 diachronic relationship, 198, 206 diachronic rule, 87 diachronic social generalization, 61 dialects, 43, 54 Dictman, 338n direction, directional spreading, problems, 249ñ51, 285; see also default directionality diversity, 68, 216, 217 Dowty, David, 87 Dressler, Wolfgang U., 30 Dutch: planar segmentation, 217 Elmedlaoui, Mohamed, 218 endocentric noun-noun compounding, 87 English, 54, 62; accent, 59; affixation, 121; apophony marking tense, 230; composites, 131; compounds, 78, 80, 86, 99, 138ñ39, 272; ó, NX-N, 135; ó, pseudo-compounds, 140; ó, right-headed, 95ñ96; Ernism 284, 286; irregular plurals, 129, 243; infectional affixes, 231; internal inflection, 134ñ35; Item and Arrangement, 117, 128; lexicon, 288, 297; morpheme, 56; morpheme structure condition (MSC), 56ñ57; morphology, 57,
69, 72; ó, consistency, 260; noun incorporation, 159, 185, 187; prefix-stem combination, 139, 140; passive verbs, 14; planar segmentation, 217; phonology, 69, 224; phonological well formedness condition, 55; plurals, 13, 32; productivity, 278; suffixation, 93; syllable an word pattern, 142; verbs, 173, 232; ó, classification, 158; word formation process, 223 Ernism, metatheory, 291ñ99 Erplusly, 286 etymology, 107, 144n, 292ñ93, 295, 296 evolution, 181 expression, 51 expressive power, 179 extension, 22 extraction capability, 328 extraction process, 221 extra-linguistic factors, 57 fillers, 34 filling defaults, 257ñ59 filters, 34 Fodor, J.A., 14 Ford, Alan, 11, 77, 78, 126, 279, 280, 299ñ300, 301, 304, 311, 312 formalism, 215, 314, 316 form-substance dichotomy, 297 Frajzyngier, Zygmunt, 228 French: compound words, 58; morpheme, 25; morphological strategies, 32, 58, 59; ó, paradigmatic, 275; noun, 53; phonological rule, 20 frequency factor, 278 Gauger Hans-Martin, 206 Gender, 46, 70 generalizations, 19, 27ñ29, 33, 48, 67, 71, 106ñ7, 125, 151, 181, 184, 187, 190, 216; cross-component, 125 generative grammar, 197 generative Phonology, 30 generativity, 179
344 ! Explorations in Seamless Morphology genitive singular marker, 28 Georgian: noun-compound composites, 133 German 39n; bidirectionality, 280; complex words, 281; compounds, compounding, 78, 86, 272, 274ñ75, 277; ó, N-X-N, 135; inflection classes 280; noun structure, 27ñ28; productivity, 278; structure, irrelevance 273; Word Formation Rules, 125 Gesamtbedeutung, 16, 198; vs polysemy, 203ñ6 Godsmith, John, 232, 233, 249 government binding theory, 151 grammar, 19; and analogy, 243 Grammaticalization, 78 Greco-Roman morphology, 66ñ67 Greece, 284 Greek: N-X-N compounds, 135 Greenlandic: Incorporated nominals, 51 Halle, M., 25, 30, 34, 35, 36, 216, 232, 254 Hammond, Michael, 219, 245, 256 Harrison, Shelly, 171, 174, 182 Hawaiíi, 138, 155, 184 headedness of compounds, 116ñ17, 123, 128, 133, 142ñ43; in Chinese, 90ñ91, 95ñ100, 104 Heath, Jaffrey, 218, 245ñ55 Hebrew, Bliblical; ó, directional defaults, 255; Modern, 213, 218; vowel lengthening, 264n Hendrick, Randall, 122, 149, 183 heteromorphemic, 263n Hindi: compounds, 77, 79ñ80, 85ñ 87; morphology, 70ñ71, 72ñ73 Hjelmslev, L., 24 Hobermann, Robert, 255 Hockett, Charles, 93 homograph, 92 homonymous, 99ñ100, 106, 198 homophonous words, 102, 132, 162 Hooper, Joan B., 30, 60ñ61, 161, 169 Hudson, G., 30 human cognitive processing, 329 Hungarian, 301
ëiambicí pattern, 238 Ibibio, 213; morphological consistency, 260; spreading, 227; syllabic pattern, 228; templatic morphology, 226, 230; verb roots, 240 Icelandic place names, 131, 132ñ33 iconicity, 309 identity, 312 ideographic writing system, 92 idiolects, 43 idioms, 135; in lexicase, 132 idiosyncrasy, idiosyncrasies, 212ñ13, 214, 217, 221, 224, 248, 253, 261, 262n, 263, 285, 286 Imdlawn Tashlhyt Berber, 218 imperfect tense marker, 27 inalterability, 262 incremental morphology, 312 in-deconcatenatable representations, 73 infixes, infixation, 72, 125, 126, 215, 216, 219, 221, 234, 235 inflectional/derivational distinction, 21 inflection, Inflectional, 27, 28, 34, 37, 50, 72, 73, 80, 124, 131, 230, 231, 270, 302, 304ñ5, 314, 317, 335; classes, 278; ó, rejection, 279ñ80; versus derivation, 68; internal, 179ñ80; process of, 21ñ 22; uniform treatment, 279; word formation, 301 in-laws, 86, 134ñ35 innovation, 22, 274 input and output, relationship, 228ñ 30 instantiation, 67 intermix, 303 Inuktitut, 51 Inuttitut, 46, 49, 52, 53 irregularity, 22 isomorphism, 24 Italian: word-formation, 275ñ76 item and arrangement (IA), 116, 156; hierarchical, 117ñ18, 128 Jakobson, Roman, 205 jargonaphasic, 38 Jensen, John Thayer, 188
Index ! 345 Kashmiri phonology, 70 Katz, J.J., 14 Kaye, Jonathan, 215 Keegan, John M., 220 Kelkar, Ashok, 66 Keller, L., 36ñ37 Khasi: morphological strategies, 69 Kiparsky, Paul, 29, 34, 254, 278 Koncar, Nenad, 338n Kosraean: noun incorporation, 172; ó, stranding, 189ñ90; transitive verbs, 161, 164 Kurylowicz, J., 23, 24, 29, 30, 31, 32 Kusaiean: noun incorporation, 159, 163; object incorporation, 162 Laca, Brenda, 205 Langlois, 33, 59ñ60 language acquisition, 14 language vision, 44ñ45 Lateral Cardinal Geographic Direction (LCGD), 289 Latin America, 202, 203 Latin, 39n; accent, 59 Lee, Keedong, 171 left to right (L-to-R) association default in Classical Arabic, 249ñ 55, 261, 264; in other languages, 255ñ56; universal, 257ñ59 lexeme, 23, 333 lexical analysis, 154ñ58 lexical component, split, 35 lexical, derivational, 128 Lexical Functional Grammar analysis of VR forms, 101ñ2, 108 lexical gaps, 108 Lexical Integrity, 289 lexical semantic defaults, 104 lexicalization, 48, 272 ëlexicaseí, 11, 13, 14, 132, 156, 157, 158, 163, 181, 184, 186; analogical derivation rules, 141; case-relation system, 110 lexicase dependency grammar, 108, 110, 118, 132, 160 lexicon, 13, 23, 34, 124, 156, 206, 216, 218, 224, 261, 274, 287, 301, 302; and morphology, 34ñ36
Li, Charles N., 94, 99, 108, 109ñ10 Lieber, Rochelle, 22, 27, 29, 53, 80, 95, 216 Lin, Fu-wen, 101, 102, 104, 105ñ7, 108, 109 linear alternation, 217 linear dimension, 221 linguistic competence, 19 linking elements, 277 loan words, 221 localization, 295 Locke, S., 36ñ37 Locus, 158; with internalized Patient in its scope, 184ñ87 Lombardi, Linda, 256 Lowenstamm, Jean, 215, 264n MP-rule, 39n, 61 Machine Translation system, 328 Mah∑bha¶ya I, 87 Malkiel, Yakov, 59 Mandarin: compounds, 90, 94, 100ñ 101; monosyllabic words, 93; word formation rules, 125 mapping, 219ñ20, 221ñ23, 226, 229, 235, 240, 244ñ45, 246ñ47, 250, 252, 256, 260, 292, 294; mappings, 83 Marantz, Alec, 151, 229 markedness and frequency, 325n Marshellese: LOC with internalized Patient, 184, 185ñ87; noun incorporation, 161ñ62, 165, 169; transitive verbs, 165; verb classes, 163, 165ñ69 Martohardjono, Gita, 36, 77, 126, 301, 304, 311, 312 masculine noun stems, 244 Masica, Colin P., 183 Matthews, P.H., 21, 22, 23, 24, 27 maxicanisms, 200 McCarthy, John, 212, 214, 215, 216, 217, 218, 221, 223, 224, 233, 234, 236ñ38, 242ñ44, 251, 253ñ54, 256, 259, 260 McCawley, James, 139 meaning, 23ñ25, 80ñ81, 82, 83, 84, 91ñ92, 101, 127, 128, 129, 132,
141, 152, 155, 158, 169, 171, 180, 200, 201, 204, 207, 219, 272, 275, 277, 294; abstract, 204–5; of VR words, 102–6 Means Subcategorization, 158 metaphoric, 198, 201, 207–9 metatheoretical power, excessive, 150, 152 metathesis, rule of, 222, 248 metonymical approximation, 198, 200, 201, 206; metaphoric and, 207–9 minimal signs, 272 Minimalist Programme, 79, 122, 149 minimality/maximality constraints, 242–44 minimization strategies, 328; within Minimal Information Grammar morphology, 333–37 Mithun, Marianne, 154, 189 Miwok: syllabic pattern, 228; Sierra, 249; --, consonants and vowels, 217; --, morphological consistency, 260; --, templatic morphology, 230, 232, 233, 249; --, vowel quality, 227 Mokilese: noun incorporation, 172, 173–77 modification, 46, 52 modifier, 49, 54, 161, 189 Mohanan, K.P., 54–55, 56, 57, 157, 232 Mohawk: nominal incorporation, 53; noun incorporation, 189 Mokilese: noun incorporation, 172, 174, 182; verbs, 173, 177 Mon-Khmer languages, 183 monomorphemic substitute, 22 monophthongization, 61, 64n Monosyllabic Noun Vowel Lengthening Rule, 178 monosyllabic verbs, 226 Montréal, 11, 12, 13, 14, 16, 155 morae, 221, 224, 245 Morin, Yves-Charles, 33, 59–60, 61 Moroccan Arabic, 218 morphemes, 12, 15, 23, 24–26, 31, 32, 33, 34, 35, 54, 55–57, 64n,
72, 91, 93, 94, 103, 116–17, 118, 119, 135–40, 152, 154, 156, 171, 172, 180, 213, 216, 217–18, 219, 220, 221, 224, 231, 248–49, 251, 261, 262n, 284, 285, 287, 288–89, 296; based alternatives, 157; compounds, 97–98, 99; as minimal unit, 272 morpheme structure condition (MSC), 54–55, 56–57 morphemology, 67, 72 morpholexical process, 29, 102 morphological, morphology, 30–31, 125; complexity, 72; concerning units smaller than the word, 23–29; fragmentation, 47; naturalness, 278; non-compositional, 157, 180; operation, unity, 20; pattern based, 221, 222; relationships, typology, 231; role, 45; types, 21–23 morphonology, morphonological, 11, 15, 45, 57, 59, 61, 62 morphophoneme, morphophonemics, 11, 30, 31, 43 morphophonology, morphophonological, 20–21, 29–34, 61; processes, morphemic status, 24; rule as a phonological rule, generalization, 60, 61 morphophonomics, 30 morpho-syntactic features, 142–43, 302–3 morpho-syntax, 63n motivation, 290 Munda, 148; compounds, 135 mutators, 334–35 mutual exclusivity, 205 NSEW, a morpheme concept, 287, 289 nasal deletion, 311–12, 313 nativization, 251 neologisms, 37, 38, 55, 198, 200, 203, 207, 208, 272, 278 Neuro Tran and minimal information grammar (MIG), 328–31 neutralization, 33, 60
nitya, 72 nityatva, 66 Niuean: noun incorporation, 154 nominal forms, 236 non-concatenative morphology, 212, 224 Noske, Roland, 217, 223 noun, nouns, 26, 218, 222, 235–36, 239, 240, 246, 252, 286; compounds, 124; monosyllabic, 178; overlong, re-evaluated, 253; --, plurals, 253–54; phrases, 164; noun, 50; with semantic representation, 127 noun-headed dependents (ACTANTS), 158 noun incorporation, 45–52, 53, 125, 135, 136; hierarchy of, 180–81; lexical analyses/implementation, 154–55, 157; morphology of, 171–79; seamless approach/analysis, 155–56, 157–58, 169–71; --, advantages, 179–81; --, disadvantages, 181–84 nucleus, 229, 309 null-head, 189 number, 46–47 object construction, 171, 174, 176 object incorporation, 110, 162, 164, 187, 190 Obligatory Contour Principle (OCP), 254 Onondaga: noun incorporation, 154 onset filling, 259, 261–62; and coda filling in Arabic, 251–53 Optimality Theory, 232 orthography, 100, 106, 216 output, 222, 226, 277; acceptability, 279; consistency, 231; invariant morphology, 231; template, 260, 263n; variable morphology, 247 overgeneration, 206 pada-categorematicity, 80 Pagotto, Louise, 161, 165, 172, 184 Pāṇinians, 36
‘palatability’, 144n Pandolfi, Ana Maria, 204–5 Pāṇini, 16, 39n, 66, 67, 69, 71–72, 73, 74n, 78, 79, 80, 86, 87, 123, 153, 324n ‘Paradigm Economy Principle’, 301, 324n Paradigm Structure Condition(s), 72 Paradigmatic Rule Format, 270–72 paradigmaticity, paradigmatic strategies, 279, 301, 303–7, 309, 311, 315, 316 Paradigms, 289, 301, 333, 334; intra-extra-paradigmatic, 21; an inherent part of language design, 302–3; mixture of 3rd and 4th conjugations, 307–10 Paragoge rule, 31 Parsimony, 288–89 passive voice formation, 230, 309, 310; see also active–passive strategies past tense formation, 230, 241 Patanjali, 66 Patient, 110, 132, 158, 180–81, 184–87 Paul, Hermann, 271, 278 Pawlowski, Slawomir, 338n perfect formation, 310–11, 315–16; morphological, 311–13 Pero: morphological consistency, 260; plural verb formation, 228; stem, 228; templatic morphology, 230 phonemes, 12, 127, 128, 155, 175 phonetics, 44, 297 phonology, phonological, 11, 13, 20, 30, 31, 32, 35, 44, 45, 53–54, 57, 61, 62, 232, 241, 250–51, 261, 285–86, 294, 310; and morphology, 20, 53, 74n, 216, 217, 219–24, 236, 271; quantity and quality, 229–30; unpredictability, 150–51, 152 phonologization, 33, 39n, 60 phonotactics, 20 phonotactic well-formedness, 13 phrasal idiom approach, 134
phrase, phrases, 92, 131; benefactive classifier, 190 phrase structure rules, 124, 125 ‘physical impact’, 204 Picard, M., 33 planar V/C segregation, 224, 262n pluralization process, 32 plurals, 29, 253–54, 260, 277, 278, 309; broken, 239, 240, 244–48, 264n; irregular, 128–31; morphological structure, 54 plural formation, 26, 219–20, 221, 222–23, 235, 241–42, 243, 244–45 Pohnpeian: free verbs, 173; noun incorporation, 159, 163, 172–73, 177–78; object incorporation, 162 Polar Cardinal Geographic Direction (PCGD), 289, 290 Polar moiety, 289 Polish: feminine nouns, 335 Polynesian, 182 ‘polysemy’, 162 populous stems, 238, 242 possessed-possessor construction, 50 possession, 46, 49–51 pragmatics, 78 pratyaya, 18, 72, 86 predicate-argument structure of VR Compounds, 102, 103 prefix, prefixes, 31, 71, 72, 86, 139, 219, 230, 235–36, 256; stem combination, 139–40 Prince, Alan, 215, 216, 219, 221, 224, 233, 237–38, 242–44, 253–54, 259, 260 Priscian, 27 productive process, 313 productivity, 57, 198, 280–81; and the role of models, 277–79; variable, 150, 152 Projection Morphology, 17n pronounceability, 13, 56, 157 pronunciation possibilities, 56, 158 prosodic, 215, 221, 244, 245–46, 250–51 prosodic circumscription, 219–20, 235, 236, 238, 263n
Prosodic Morphology Hypothesis (PMH), 212, 215, 229, 234 pseudo-affix, 257, 259, 261 pseudo-compounds, 135–36 pseudo-words, 83 psycholinguistic, 14 quantity based morphology, 244, 251, 257–59, 261 Quebec French, 33 Quine, Willard van Orman, 80 radicals, 27 Ratcliffe, 220, 232, 240 ‘reciprocal strategies’, 304, 306, 309, 310, 315 redundancy rules, 32–33, 35, 49, 60 reduplication, 173, 174, 216, 224, 229, 230, 231, 311, 314 reference, 46, 51–62 Rehg, Kenneth L., 172–73, 178 relative clauses, 161 Rely, 287 Renaissance, 197–99, 200, 201 rhotacism, 313 Romance stratum, 284, 297 roots, 23, 27, 28, 29, 78, 80, 94, 241, 242–43, 253, 261, 313, 316; based analysis, 219; autosegment, 250; biconsonantal interpretation, 254–55; extraction vs base extraction, 223; gemination, 253; as morpheme approach, 221; mutation, 310, 315; and pattern morphology of Arabic, 16, 212–15, 219, 221, 223, 224, 229; syllables, 325; vowel lengthening, 311, 313, 314 Rosen, S., 154, 188 rule-based morphology, 36–38, 232, 260, 329–35 Russian, 205; non compound composites, 133 Sadock, Jerrold M., 46–49, 51–52, 149, 150, 151 Sakatyayana, 15
sandhi, 92, 312 Sanskrit: morphology, 71–72 Sapir, Edward, 154, 155 Saussure, 66, 270, 292–93, 297–98; post-Saussurean linguists, 294–95 schema, instantiations, 67 schemata, 329 seamless, seamlessness, 11, 12, 66; wholes, 12 segmentation, 26, 161, 223, 290, 333, 334; and classification, 24, 153; horizontal, 220, 221, 222, 225, 227, 247, 259; vertical, 221, 222, 247; word-based, 222 Selkirk, Elisabeth, 34, 36, 119, 123 semantic, semantics, 50, 80, 82, 141, 205, 232, 262n, 277, 285, 296, 297, 299, 302; defaults, 104; deviation, 108; effect of noun incorporation, 171; features, 129–30, 140; independence, 80, 82; noncompositionality, 92, 131; phonetic correlations, 84; pragmatic autonomy, 51; substance, 298; transitivity, 161–62; transparency, 278; unpredictability, 151–52 semi-automatic tagging, 333 Semitic languages, 16, 212, 214, 238, 249, 255, 260 sentences, 154, 168 Serbo-Croatian, 330–32, 336 shape, 129, 155, 172, 179, 222, 231, 239, 240 shape invariant morphology, 213–15, 216, 217, 218, 220, 222, 223, 225–42 Siegel, Dorothy, 273 Sierra Miwok, see Miwok, Sierra Singh, Rajendra, 77, 78, 126, 154, 279, 299–300, 301, 304, 311, 312 singular, 46, 222, 244, 245–47, 277, 279 Sino-prefixing, 136–37 Smith, Norval, 256 Sohn, Ho-min, 162–63, 169 sonables, 12 sonority, 221, 222, 224; consonant, 257, 259
Sora: noun incorporation, 148, 183; syntactic analysis, 150–52 sounds, 83 South Asian languages: morphological strategies, 69–72 Southern Tiwa, 46 Spanish, 32, 43; accent, 59; acquisition and the diachrony, 32; noun, 33, 61, 64n speaking, speaker, 271, 272, 293–94, 296; knowledge, 68 speech, 54, 302; errors, 14; -- and integral listing, 36–38 Spencer, Andrew, 224, 277, 303 spirantization, 226 Staib, Bruno, 205 Stampe, David, 12 Starosta, Stanley, 11, 15, 77, 154, 186 statability, 15 stem, stems, 23, 27, 28, 29, 34, 48, 50, 54, 78, 80, 94, 102, 118, 139, 143, 155, 180, 218, 219, 222, 225, 235, 236, 239, 241–43, 244–45, 247, 248, 250, 252, 256, 257, 259, 260, 270, 280, 281, 334, 336; consonantal, 241; morphemes, 35; neologistic, 37; non-past, 262n; versus principal parts, 119–21; tri-consonantal, 118, 121; vowel, 219, 227, 230 Stonham, John T., 216 stranding, 187–90 strata, 34 structuralism, 23, 36, 58, 197, 293, 298 sub-categorization, 158 substitution, 32 suffixes, suffixation, 18, 29–30, 34, 46, 53, 54, 55, 56, 72, 79–80, 93, 99, 126, 163, 171, 173, 178, 181, 182, 219, 226, 227, 231, 235–36, 240, 248–49, 252, 275, 277, 313; in verbs, 102–3, 105–6, 109, 111; derivational, 52; medializing, 53 Sugita, Hiroshi, 162, 163, 165 word-formation process, 22–23, 60, 154 syllabic/moraic structure, 231
syllabification, 260 syllable patterns, syllabification, 55, 215, 220–21, 222, 224, 225–26, 228, 229, 235, 236, 237, 239, 240, 245–47, 250–51, 260, 262n, 263n, 309 synchronic system, 61, 82, 99, 293, 296 syncretism, 72 synonymous, 46 syntactic: and anaphoric irregularities, 179; categories in traditional Chinese, 96–98; considerations, 310; distribution, 93, 158; motivation, 22; operations, 47, 48, 50, 53; phrases, 272; rules, 156; theory, 11; typology, 183; word classes, 93, 95, 132 Syntagmatic, 295; iconicity, 309 syntax, 12, 45, 53, 70, 77, 84, 124, 125, 141, 150, 183, 189, 216, 217, 262n, 274, 286, 294, 296; autolexical, 150; pre-lexical, 45, 47, 48 Syriac directional defaults, 255 ‘system dependent’ naturalness, 278 Tashlyht: morphological consistency, 260; Berber, directional defaults, 256 taxonomic methodology, 24 Taylor, Harvey, 11 Taylor, Henry, 11 templates, templatic morphology, see shape invariant morphology terminological standardization, 16 Thai: patient incorporation, 181 theme, prominence of, 110 Theta-criterion, 108–11 Thieme, Paul, 66 Thompson, Sandra A., 94, 99, 107, 161, 169 Tigrinya: L to R spreading, 255 timing slots, 249–59 tonal association, 250 tractability, 291 transformation, transformational rules, 150, 156, 157, 179
transitivity continuum, 161–62, 169; generative analysis, 162–63 Translation Experts Ltd., 328 trichotomy, 35 Trubetzkoy, N.S., 30 Trukese, 162 truncation, 32 Turkish, 117; suffixes, 231 typology, intra-extra-pragmatic, 21 Ukrainian, 31 Ulrich, Charles H., 256 umlaut, 29–30, 32, 33, 34, 64, 280 uniqueness, 281 universal grammar, 303 Urua, Eno, 213, 226, 240 usage, 204–5 utterances, 37, 272, 294 Vakyapadiya III, 87 Valdivieso, Humberto, 204–5 variability, 294 variation and change, 221 Varin, Marie-Ève, 33, 59–60 Venezuela: derivatives, 200, 202 Vennemann, T., 30 verb, verbs, 32, 37, 45–46, 49, 50, 51–53, 101, 125, 139–40, 148, 149, 154, 161–63, 218, 225, 230, 235–36, 241, 252, 254, 260, 263n; classes and Micronesian transitivity, 163; classification, 158, 228; complex, 275; complement, 46, 49, 110; etymological, 173; incorporated, 168; inflection, 301–2; intransitive, 110, 166, 172–73, 174–77, 187; in Latin, 301, 304, 309, 310, 313, 316; monosyllabic, 228; neutral, 162, 169; resultative, 125; semitransitive, 162; single argument, 167; stem, 155, 240; stringing noun, 170–71; synchronic, 173; transitive, 165–66, 169, 173, 177; -- bi-transitive, 174, 175; unincorporating, 167; see also noun incorporation vibhakti, 72 vikar, 71, 86
vikari, 71, 85 vocalic alteration, 43 vocalic epenthesis, 31 vocalism, 230, 262n voicing process, 32 vowel, vowels, 216, 230, 237, 239, 241, 250–51, 261, 309, 310, 312, 313; change, 274; and consonants, 213, 216–18, 220–24, 226–27, 246, 247; harmony, 34; lengthening, 178, 256; quality, 218, 248; variation, 222, 226 well-formedness conditions (WFC), 31, 33, 49, 54, 55, 59, 60, 217, 250 Williams, Edwin, 95, 123, 154 Wittgenstein, Ludwig, 16 Woleaian: noun incorporation, 159; transitive verbs, 169 word, words, 20, 36, 38, 51, 54, 66, 67, 156, 292–93; autonomy, 273; existing and new, morphological relations, 271; finding, 36–38; formal relationships, 15, 18–21, 35; headedness, 274; internal grammatical structure, 15, 119; internal spaces, 172, 174; based morphemes, 261; based analyses vs morpheme-based item and arrangement/item and process analysis, 152–54; based morphology, 21, 244; role from a paradigmatic point of view, 271–72; parts, 26, 27; relation, 271; stress pattern characteristic, 92; structure, 285; -- irrelevance, 273–74; -- and phrase structure, 181–83; syntax, 231, 270, 271, 274, 276 word formation, 50, 261, 272, 275, 277; derivational, 29; syntactic
operation, 63n; word-based approach, 155 Word Formation Rules (WFRs), 32, 119, 122, 123–25, 133, 136, 139, 141, 143, 153, 270, 275 word formation strategies (WFSs), 12–15, 16, 19–20, 34, 35, 36, 43, 56, 58, 59, 62, 67–68, 72, 78, 79, 80, 86, 87, 120–21, 126–30, 134, 136–37, 140, 156, 158, 163, 169–72, 177, 179, 260; constraints, 156–57 word and paradigm morphology (WP), 116, 118, 121, 123, 270, 280 Word Structure Rules, 122, 123–24, 135, 137 writer, 270–71 Wurzel, W.U., 27 X-bar theory, 182 Xless, 22 Y-grade formation, 256 Yapese: noun incorporation, 161; --, stranding, 187–88, 190; transitive verbs, 164 Yawelmani, 232, 256; autosegmental morphology, 213, 217, 218; default consonant, 257; morphological consistency, 260; syllabic pattern, 228 Yawelmani Yokuts, 249; stem template, 227 Yip, 214, 253, 259 ‘zero’, as a morphological mark, 45, 58 zero-morpheme, 24 zero-suffix, 70