A TAJIK PERSIAN REFERENCE GRAMMAR BY
JOHN R. PERRY
BRILL LEIDEN-BOSTON 2005
CONTENTS
Tables and Charts
Preface
Introduction
CHAPTER ONE. Phonology and Orthography
1.1 Integration of Sound and Script
PHONOLOGY
1.2 Vowels (1): Stable and Unstable
1.3 Vowels (2): Individual Qualities
1.4 Vowels (3): Lowering, Glides, Diphthongs
1.5 Consonants
MORPHOPHONOLOGY
1.6 Syllables and Stress
1.7 Phonotactics
1.8 Alternation and Suppletion
ORTHOGRAPHY
1.9 Writing Systems: Introduction
1.10 Cyrillic (1): General
1.11 Cyrillic (2): Consonants
1.12 Cyrillic (3): Vowels and Semi-vowels
1.13 Perso-Arabic (1): General
1.14 Perso-Arabic (2): Vowels
1.15 Morphographics
1.16 Segmentation and Punctuation
CHAPTER TWO. Morphology: Nominals
2.1 General Observations
NOUNS
2.2 Gender
2.3 Gender and Age
2.4 Number (1)
2.5 Number (2)
2.6 Number (3)
2.7 Definiteness and Specificity (1)
2.8 Definiteness and Specificity (2)
2.9 Definiteness and Specificity (3)
IZOFAT AND -RO
2.10 The izofat Constructions: Common Features
2.11 Adjectival izofat
2.12 Nominal izofat (1)
2.13 Nominal izofat (2)
2.14 Nominal izofat (3)
2.15 Nominal izofat (4)
2.16 Particular izofat Structures
2.17 The Enclitic -ro
2.18 Other Uses of -ro
ADPOSITIONS
2.19 Prepositions: Simple
2.20 Prepositions: Derived
2.21 Prepositional Phrases (1)
2.22 Prepositional Phrases (2)
2.23 Postpositions
2.24 Postpositions of Opportunity
2.25 Circumpositions
2.26 The Vocative
PRONOMINALS
2.27 Personal Pronouns: Forms
2.28 Personal Pronouns: Functions
2.29 Pronominal Enclitics: Forms
2.30 Pronominal Enclitics: Functions
2.31 Demonstratives
2.32 Reflexive and Emphatic Pronouns
2.33 'Other', and Reciprocal Pronouns
2.34 Interrogatives
2.35 Interrogative Phrases
2.36 Indefinite and Specific Pronouns: 'Some —'
2.37 Indefinite Pronouns and Adjectives
2.38 Universal Pronouns: 'Each, All, None; One'
ADJECTIVES
2.39 General Features
2.40 Attributive Functions
2.41 Predicative Functions
2.42 Comparison of Adjectives
2.43 The Superlative
2.44 Similes, Intensives, Attenuatives
2.45 Quantifiers: 'Much' and 'Little'
ADVERBS
2.46 Adverbs (1): General; Place and Time
2.47 Adverbs (2): Degree and Manner
2.48 Adverbs (3): Compound and Phrasal
NUMERALS
2.49 Cardinal Numbers
2.50 Number Phrases (1)
2.51 Number Phrases (2)
2.52 Ordinal Numbers
2.53 Numerical Expressions
2.54 Days, Dates, Time
2.55 Everyday Mathematics
CHAPTER THREE. Morphology: Verbs
VERB STRUCTURE
3.1 Overview
3.2 Stem Classes (1)
3.3 Stem Classes (2)
3.4 Personal Inflections
3.5 Prefixes
3.6 The Verb 'To Be' (1)
3.7 The Verb 'To Be' (2)
3.8 The Verb 'To Have'
CONJUGATIONS: SIMPLE
3.9 Tenses from the Aorist
3.10 Present Indicative: Forms
3.11 Present Indicative: Functions
3.12 Simple Past
3.13 Imperfect
CONJUGATIONS: COMPOUND
3.14 Definite Future
3.15 Perfect Indicative
3.16 Pluperfect Indicative
3.17 Stative Verbs
PROGRESSIVE TENSES
3.18 Present Progressive
3.19 Past Progressive
3.20 Other Progressive Constructions
NON-WITNESSED MODE
3.21 The Perfect as a Non-Witnessed Form
3.22 Non-Witnessed Durative
3.23 Non-Witnessed Past
3.24 Non-Witnessed Past Progressive
THE SUBJUNCTIVE
3.25 Present Subjunctive
3.26 Past Subjunctive
3.27 Durative Past Subjunctive
3.28 Present Progressive Subjunctive
3.29 Imperative and Optative
CONJECTURAL MOOD
3.30 The Conjectural Mood: Introduction
3.31 Past Conjectural
3.32 Present-Future Conjectural
3.33 Present Progressive Conjectural
PASSIVE VOICE
3.34 Passive Voice: Forms
3.35 Passive Voice: Function (1)
3.36 Passive Voice: Function (2)
NON-FINITE FORMS
3.37 Infinitives
3.38 Other Nouns of Action and Activity
3.39 Verbal Adjectives and Adverbs
3.40 Participles: General
3.41 Present Participle [kunanda]
3.42 Future Participle [kardanī]
3.43 Past Participles I and II
3.44 Past Participle II [kardagī]
3.45 Present Progressive Participle [karda istoda/-gī]
3.46 Present-Future Participle [mekardagī]
CHAPTER FOUR. Syntax
PHRASE AND SIMPLE SENTENCE
4.1 The Noun Phrase
4.2 The Simple Sentence: Word Order
4.3 Subject and Complement
4.4 Object Marking
4.5 Gapping and Ellipsis
4.6 Verbal Agreement
4.7 Questions (1): Word Order and Intonation
4.8 Questions (2): Particles
4.9 Responses and Exclamations
4.10 Sentence Adverbs and Enclitics
THE COMPLEX SENTENCE
4.11 Coordinate Conjuncts: 'and'
4.12 Disjunctive Conjuncts
4.13 Parallel Conjuncts
4.14 Adversative Conjuncts
4.15 Sentential Complements (1): Ground Rules
4.16 Sentential Complements (2): Particular Types
4.17 Miscellaneous Sentential Complements
4.18 Reporting Speech
4.19 Serial Verb Coordination
MODAL CONSTRUCTIONS
4.20 Necessity and Obligation
4.21 Presumption, Probability, Possibility
4.22 Ability
4.23 Volition
4.24 The Verb Šudan
4.25 Hortative, Inceptive, and Related Constructions
SUBORDINATE CLAUSES: PREPOSED
4.26 General
4.27 Temporal Clauses (1)
4.28 Temporal Clauses (2)
4.29 The Conjunction to
4.30 Circumstantial Clauses
4.31 Substitution of ki in Preposed Clauses
4.32 Adverbial Clauses of Place, Manner, Degree
4.33 Miscellaneous Adverbial Clauses
4.34 Concessive Clauses
CONDITIONAL SENTENCES
4.35 Conditionals (1): Basic Rules; Possible Conditions
4.36 Conditionals (2): Counterfactual
4.37 Conditionals (3): Actual Conditions
4.38 Conditionals (4): Variations and Idioms
CLAUSES USUALLY POSTPOSED
4.39 Temporal and Explanatory Clauses
4.40 Clauses of Result and Purpose
4.41 Postposed Clauses with to
RELATIVE CLAUSES
4.42 Relative Clauses (1): Synopsis
4.43 Relative Clauses (2): Non-Restrictive
4.44 Relative Clauses (3): Restrictive
4.45 Relative Clauses (4): Anomalies
4.46 Relative Clauses (5): Specialized Types
4.47 Nominalizations
CHAPTER FIVE. Lexis and Sociolinguistics
NOMINALS: CONVERSION AND SUFFIXES
5.1 Homonymy and Conversion
5.2 Suffixes (1): Main Noun Formatives
5.3 Suffixes (2): Other Noun Formatives
5.4 Suffixes (3): Main Adjective and Adverb Formatives
5.5 Suffixes (4): Other Adjective and Adverb Formatives
NOMINALS: PREFIXES AND COMPOUNDS
5.6 Prefixes
5.7 Compounds: Determinative
5.8 Compounds: Possessive
5.9 Verb-Stem Agentives
5.10 Stem I Activity Nouns
5.11 Coordinates and Phrases
5.12 Reduplication and Expressives
VERBS: DERIVATION
5.13 Denominal, Factitive, and Transitivizing Verbs
5.14 Causative Verbs (1)
5.15 Causative Verbs (2)
VERBS: COMPOSITION
5.16 Complex Verbs (1)
5.17 Complex Verbs (2)
5.18 Composite Verbs (1)
5.19 Composite Verbs (2)
5.20 Conjunct Verbs (1)
5.21 Conjunct Verbs (2)
SOCIAL AND HISTORICAL NOTES
5.22 Modes of Address
5.23 The Arabic Element
5.24 Lexical Distribution, Persian ~ Tajik
5.25 Uzbek and Turkic Influences
5.26 Russian Influences
5.27 Chronology of Tajik Persian
Bibliography
Grammatical Index
Cyrillic Index
Arabic Index
TABLES AND CHARTS
Fig. 1.2 Tajik Vowels
Fig. 1.5 Consonants
Fig. 1.6 Statement Intonation Profiles
Fig. 1.10 The Tajik Alphabet: Cyrillic and Latin
Fig. 1.13 The Tajik Alphabet: Perso-Arabic
Fig. 2.7 Definiteness and Specificity in Nouns
Fig. 2.27 Personal Pronouns
Fig. 2.29 Pronominal Enclitics
Fig. 2.49a Numbers 0-19
Fig. 2.49b Numbers 20-1000
Fig. 2.54 Months of the Tajik Year
Fig. 3.1 Tentative Synopsis of the Tajik Tenses
Fig. 3.3a Irregular Verb Stems: Infinitive -> Stem I (Present Stem)
Fig. 3.3b Irregular Verb Stems: Stem I (Present Stem) -> Infinitive, Cyrillic
Fig. 3.3c Irregular Verb Stems: Stem I (Present Stem) -> Infinitive, Perso-Arabic
Fig. 3.4 Verb: Personal Endings
Fig. 3.6a 'To Be': Enclitic Forms (Present Indicative)
Fig. 3.6b 'To Be': Independent Forms (Present Indicative)
Fig. 3.6c 'To Be': Negative (Present Indicative)
Fig. 3.7a 'To Be': Aorist
Fig. 3.7b 'To Be': Simple Past Tense
Fig. 3.9a Aorist: зистан/зи-
Fig. 3.9b Aorist: гиристан/гиря-
Fig. 3.10a Present Indicative: кардан/кун-
Fig. 3.10b Present Indicative: омадан/о(й)-
Fig. 3.10c Present Indicative, omadan: Variants
Fig. 3.12 Simple Past: kardam 'I did, made'
Fig. 3.13 Imperfect: mekardam 'I was doing, used to do; would do'
Fig. 3.14 Definite Future: xoham kard 'I shall do'
Fig. 3.15 Perfect Indicative: kardaam 'I have done'
Fig. 3.16 Pluperfect Indicative: karda budam 'I had done'
Fig. 3.18 Present Progressive: karda istodaam 'I am doing'
Fig. 3.19 Past Progressive: karda istoda budam 'I was doing/making'
Fig. 3.22 Non-Witnessed Durative: mekardaast 'he is (evidently) doing/used to do/will do'
Fig. 3.23 Non-Witnessed Pluperfect: karda budaast 'he (evidently) had done'
Fig. 3.24 Non-Witnessed Past Progressive: karda istoda budaast 'he was (evidently) doing/making'
Fig. 3.25 Present Subjunctive: kunam '(that) I do/make'
Fig. 3.26 Past Subjunctive: karda bošam 'I might have done'
Fig. 3.27 Durative Past Subjunctive: mekarda bošam 'I might have done/be doing'
Fig. 3.28 Present Progressive Subjunctive: karda istoda bošam 'I may be doing'
Fig. 3.31 Past Conjectural: kardagist-/-gi- 'I suppose [he] did; [you] might have done', etc.
Fig. 3.32 Present-Future Conjectural: mekardagist-/-gi- '[he] might be doing/about to do', etc.
Fig. 3.33 Present Progressive Conjectural: karda istodagist-/-gi- '[he] might be doing', etc.
Fig. 3.40 Participles: Characteristic Features
Fig. 3.42 Participial Quasi-Future Tense: kardaniam 'I am going to do, intend doing'
Fig. 4.7 Question Intonation Profiles
PREFACE
This work aims to provide quick, easy, and comprehensive access to the grammatical structures of Tajik Persian of Central Asia, as used in writing and educated speech from the early years of the twentieth century onward. The detailed lists of contents and tables, plus three separate indexes, will enable users at any level of competence, whether familiar with the Cyrillic or the Perso-Arabic writing system, to find a particular paradigm or syntagm with illustrations of usage. The range and types of Tajik exemplified and the approach and procedures employed here are described in the first three sections of the Introduction; grammatical terms and abbreviations used are defined in the last three sections.
Tajik Persian has been changing rapidly in the past three generations. This is partly a response to natural processes as its speakers come to grips with political and social upheavals; partly due to the influence of Uzbek, Russian and other foreign languages; and in particular the result of two waves of government-sponsored linguistic engineering. It is one of the objects of this grammar to note aspects of these changes, the better to meet the various needs of scholars and students as this remarkable language approaches its centenary (or, from a broader perspective, the fifth decade of its second millennium).
My debt to the work of other scholars may be gauged from the Bibliography; Gilbert Lazard, Lutz Rzehak, and Gernot Windfuhr merit particular mention for personal help and encouragement beyond their publications. I am happy to acknowledge an award from the U.S. Department of Education under the Title VI International Research and Studies Program during 2002-03, which enabled me to undertake this project unencumbered by academic duties. Sincere thanks are due to several Tajik friends and colleagues for direct and indirect assistance with grammatical points, notably Gulnora Aminova, Azim Baizoyev, and especially Hadiya Nazirova.
I am particularly grateful to multiliterate metagrammarian Judith Wilks for her meticulous copy-editing and
proofing, and for applying a user's perspective to some potentially opaque passages. The expert advice of Brill's editor, and of the anonymous reader, provided a further safety net. Any shortcomings in the final product are to be laid at my door alone.
Chicago, September 2004
INTRODUCTION
History and Actuality
Tajik Persian, or Tajik for short (zaboni tojikī, zaboni forsii tojik), is the variety of New Persian used in Tajikistan and parts of Uzbekistan, including the cities of Bukhara and Samarkand. Since the 1920s it has been fostered as the national and literary language of the Soviet Socialist Republic (from 1991, the independent Republic) of Tajikistan. Other designations in English are the older "Tadzhik" (through Russian, hence the unnecessary trigraph dzh) and the newer "Tajiki" or "Tojiki," which seem almost as un-English in a different way. The Tajik and Iranian Persian speech areas are not contiguous, but lie at opposite ends of a continuum with Persian dialects of Afghanistan in between, and interrupted by areas of Turkic (Turkmen and Uzbek) speech.
Spoken Tajik has been evolving independently of Persian of Iran since at least the sixteenth century, but the written language (which functioned as the common language of high culture, government and diplomacy in Iran, Central Asia, and India) maintained a near-universal standard, based on Classical Persian, until the early decades of the twentieth century (see Chronology, 5.26). In the Soviet period, with the promotion of a more vernacular style and lexicon and the systematic introduction of Russian loanwords, language change was more rapid. The writing system was switched from Arabic to Latin (on regular phonemic principles) in 1928, then to Cyrillic (following Russian-specific rules) in 1939.
The modern literary language (zaboni adabii hozirai tojik), as planned and exemplified by Soviet Tajik writers of the 1920s and 1930s, was based loosely on the style of the old cultural center, Bukhara; it contained many Uzbek loanwords and some syntactic structures calqued on Uzbek usage. In recent decades writers from different regions of Tajikistan, and some who have traveled abroad, have been introducing a more varied style, including features closer to Persian of Iran.
Literacy is now defined not as the eradication of a previous script
and its replacement by one more ideologically correct, but as education in the current revised Cyrillic, plus the Perso-Arabic of the Classics and the neighboring Persian-speaking states, plus the Latin of the Western world.
Post-modern Tajik is still in transition. Apart from the vacillations in orthography and style which can be seen in literature and the press, there are undoubtedly shifts in pronunciation and idiom underway as demographic fluctuations change the composition of urban populations and the nature of interregional links. In the space of a mere two generations, Tajik has been one of the most consciously, intensively, and rapidly "planned" languages ever, both at the stage of Russianization (late 1920s to 1950s) and again during re-Persianization (late 1980s onward). It has three complementary identities: as a linear descendant of the conservative literary standard historically dominant in the region (Classical Persian); as a distinct modern written variety of international Persian, closely related to modern Persian of Iran (fārsi) and of Afghanistan (dari); and as a cluster of regional dialects, of which the Northern group is strongly Turkicized (see 5.24).
Given such a complex history and politico-cultural economy, and a future again on the drawing board, what should a comprehensive current grammar of this language aspire to be?
There are available a number of Tajik grammars of limited scope for specialized readerships. Rastorgueva's "sketch" (1954) is an excellent short (Russian-style and Russocentric) linguistic description of Modern Literary Tajik (MLT) as conceived and nurtured up to its zenith. The three-volume grammar of Rustamov et al. (in Tajik, completed in 1989, on the eve of the dissolution of the USSR) profits from six decades of Tajik writing to provide a wealth of examples of all stages of MLT, but like all committee products it is uneven in theory and exposition.
Studies by foreign linguists from the perspective of Iranian or Afghan Persian (Birnbaum, Farhadi, Lazard, Raja'i) fill in some of the blanks in terms of historical and regional variations. A new generation of teaching manuals for the post-Soviet language, presuming no collateral language experience, is already in action (e.g., Rzehak 1999, Baizoyev and Hayward 2004). Each type may claim to be a grammar of Tajik to some extent, but none is truly comprehensive.
Colloquial and Dialect Usage
Tajik dialectology is too large (and incomplete) a topic to attempt to summarize in what is primarily a grammar of the written language. However, the written language as codified during the 1920s-1930s was explicitly based upon a particular dialect group, and socialist ideology consciously privileged vernacular usage in general over what was seen as emulation of calcified Classical models by a tiny literate elite. The following observations are confined to such aspects of regional dialect and spoken usage as have palpably affected the written language, and as such may be mentioned in passing in the Grammar.
Tajik dialects may be divided broadly into two groups: Northwestern and Southeastern, corresponding in rough topographical terms to the lowlands and highlands respectively of the Oxus basin. Several refinements of this scheme have been proposed, and much fieldwork remains to be done. The scheme adopted here distinguishes four groups:
(1) Northern, comprising Bukhara, Samarkand, and Derbend in Uzbekistan, the Ferghana Valley (including Khujand), and extending down the Varzob valley to the region of the capital, Dushanbe.
(2) Central, comprising the upper Zarafshon valley.
(3) Southern, stretching south and east of the capital, in Kulob and Qarotigin regions, including Gharm, as far as the Pamirs.
(4) Southeastern, in Mountain Badakhshan and adjacent areas.
Only the first three, which have been to an extent exemplified in literature, are referred to in the Grammar (chiefly in respect of variants in verb paradigms).
Northern dialects have been influenced to varying degrees by Uzbek, with which there is widespread bilingualism (5.24). A distinctive subvariety of Northern Tajik speech, with its own literature, is the Judeo-Persian of the Bukhara Jews, most of whom have emigrated. Some Southern and Southeastern dialects have strong affinities with those of the left bank of the upper Oxus in Afghanistan's province of Badakhshan.
Tajik is also the contact vernacular (called forsī) of Mountain Badakhshan, extending into Afghanistan and Wakhan. In these regions the mother tongue of a majority of the population is one of the Eastern Iranian dialects of the Pamir group, related distantly to Tajik Persian
but far from mutually comprehensible with it. The so-called Tajiks of southwest Xinjiang, in and around Tashqurghon, are speakers of the Pamir languages Sarikoli and Wakhi, not Persian.
It was in fact a literary variety of Northern speech, not a transcription of the vernacular, that came to form the basis of modern Tajik Persian. This special language variety, devised for educational purposes, was an invention of the Jadids of the Bukhara emirate, reformists such as Abdulvohid Munzim, Abdurauf Fitrat, and Sadriddin Ayni (Aini). At the dawn of the twentieth century, these men founded modern schools for Persian-speaking youth and devised a practical form of Persian in which to teach a modern curriculum. From the outset, both script and style were a compromise: the Perso-Arabic of the primers varied the traditional spelling to accommodate local pronunciation, and madrasa-inspired catechisms were wrapped in the near-Uzbek syntax of village speech. But its origins in Perso-Arabic script ensured that, even in its later Latin and Cyrillic versions, this language could be read as a variety of literary Persian, and not as a transcription of one or the other local dialects: not as, e.g., /dassota ti:t/ for 'give [me] your hands', but dast-ho-yat-ro dihed.
Dushanbe was a small market town before its promotion to national capital in 1924 and the consequent influx of Tajiks from elsewhere in the region, of Russians and other Soviet nationalities, and above all of the Bukharan literary elite (since Bukhara and Samarkand were allotted to Uzbekistan). Though nominally included in the Northern dialect area, this instant metropolis for long lacked a stable demographic through which to exercise its linguistic status. Since independence it is again in a state of demographic fluctuation, the home of writers and speakers from other regions with other styles.
Purpose and Procedures
The variety of Tajik described here is for the most part that of the bulk of the extant literature, Modern Literary Tajik (MLT) of the Soviet era, with the beginnings of lexical and stylistic reform as undertaken from the late 1980s. Quotations retain the original spelling, but most of the material follows the orthographic reforms of 1998. In order to
balance the needs of various users—the historical and descriptive linguist, the reader of Soviet-era sources and literature, and the student and teacher of contemporary Tajik—the present work adopts the view that a historically-informed grammar of a language barely eighty years old can coexist with a grammar of the contemporary idiom as it evolves. This perspective is reinforced by the Iranist view that the language is not eighty, but actually over a thousand, years old, and is now renewing old family connections that were obscured, but not severed, during the past century. The Grammar thus aims to furnish a comprehensive reference to the structures of written Tajik Persian from the heady days of international socialist idealism in the 1920s, through the rise, stagnation, and fall of Russian communism, into the independence of the twenty-first century. It therefore includes a grammar of essential aspects of Persian at large, which remain at the core of Tajik, and an explanation of the Uzbek- and Russian-influenced aspects of the syntax and lexicon that contribute to the uniqueness of Tajik Persian. To serve readers of Persian who may not need to cope with the Cyrillic writing system, every literary example (and each index reference) is presented in Perso-Arabic script as well as Cyrillic; all but a few dialect citations in Roman transcription are also normalized in both Tajik scripts. The Perso-Arabic spelling of some Russian loanwords (since they have never been, and may never be, written in the Arabic alphabet) is arbitrary, and that of some recent foreign borrowings is not yet standardized; where necessary, examples are also given in transliteration and phonemic transcription. The system used is set out in Sections 1.9-10, where the writing systems of Tajik are explained. Transcription of Standard Persian, where used for comparison, differs from that used for Tajik in the representation of the vowels (see 1.10). 
To gloss the examples, I have chosen idiomatic rather than literal translation, which may be followed in parentheses by a closer gloss (see Conventional Signs, below). Primary stress (1.6) is indicated where necessary by an acute accent on the vowel of the stressed syllable, and secondary stress by a grave accent. These may appear in transcription, transliteration, or Cyrillic text; in the latter case, it should be remembered that they are not part of the original orthography. Italics are used in
Latin script only, not in Cyrillic (see 1.10). Words in Cyrillic are presented with their morphemes separated by short hyphens whenever this is judged helpful. However, use of this device is kept to a minimum, since it obscures norms of Cyrillic orthography such as the dropping of the macron from final -ӣ before an affix (1.12), and the general tendency to write words maximally defined (including affixes and auxiliaries) as a single unit (1.16). The reader should assume that hyphenated Cyrillic words are normally written as one, unless a longer hyphen, the n-dash, is used: this indicates that the word is normally so hyphenated in Cyrillic (or Perso-Arabic). Systematic morpheme-separation is not practicable in Perso-Arabic script.
Grammatical and linguistic terms used are, so far as possible, limited to the conventional and generally known. Those which may have another or more general meaning are capitalized when used in their specialized sense. The Tajik grammatical terms are not used; significant ones will be noted in passing. Any terms not in common use, or used differently in the present work, will be glossed where introduced, or may be found below under Definitions.
One distinction preserved here is that between form and function. It would be misleading, for example, to call the English verb form ending in -ing "the present participle" as a label of identity, since it functions not only as the present (or active) participle (we are going there), but also in a different nominal category as the gerund or activity noun (go while the going is good). Similarly there are identical verb forms in Tajik with more than one category and function. Thus the Imperfect tense (me-kard-am, etc.) will be described in a single paradigm, but contextually illustrated as a Durative Past or a Conditional (cf. the term Aorist below, under Definitions; this also may have three functions).
Definitions
Aorist. In Tajik, the finite verb form consisting of the (present) stem and personal endings, without any prefix; corresponds usually to the Present Subjunctive, but in some verbs to the Present Indicative, and in one an auxiliary.
Aspect. A way of viewing an action or event, e.g., as being accomplished at once (punctual: the stone fell), as being in progress without regard to its completion (progressive: he is swimming), as happening habitually or repeatedly (habitual, iterative: we used to go swimming). Aspect is independent of the time at which a verb records the action as taking place (tense), but tense and aspect (as well as mood and voice, q.v.) combine to encode an action in a standard form, the "tense" in its everyday use and as presented in six-person paradigms.
Classical Persian. The literary form of Persian exemplified in texts from the 11th century CE, and used in some contexts and genres until modern times.
Complex, Composite, Compound. Complex verbs are those consisting of a simple verb and a preverbal particle (cf. English phrasal verbs; 5.15). Composite verbs are those comprising a simple verb plus a noun, adjective, or other lexical component (5.17). The term Compound is reserved for nominals and for tenses of verbs with more than one part.
Enclitic. A grammatical unit attached to the end of a word, clarifying a syntactic relation; English possessive s in its, copulative s in it's (= it is) are (different) enclitics. In Tajik, enclitics do not carry stress (contrast Suffix).
Explicit plural. A form of the verb ending or pronoun referring to an actual plurality of persons, as distinct from a plural form referring politely to a singular addressee.
Formative. A morph (q.v.) which when added (typically as a prefix or suffix) to a word or a stem forms a new word of a particular class: -ed in English is the usual Past Tense formative.
Izofat, Split. An adjectival noun phrase in which the Indefinite/Non-Specific enclitic -e is added to the head noun, as in kas-e digar 'someone else' (2.11).
Izofat, Mute. A Nominal or Adjectival izofat in which the connecting enclitic -i is not pronounced or written, as in sohib-mansab 'officeholder' (5.10).
Mirative. A function of the Non-Witnessed mode: the speaker is unexpectedly aware of a situation or suddenly appreciates its significance (3.21).
Modern Literary Tajik (zaboni adabii hozirai tojik), MLT. Literary Tajik Persian as codified during and after the 1930s under Soviet direction, and exemplified in the works of such as Sadriddin Aini, Rahim Hoshim, Jalol Ikromi, and Sotim Ulughzoda. It is characterized by vernacular constructions, particularly of the Northern dialects, such as the idiomatic use of participles, and in later works by the incorporation of Russian vocabulary. MLT is described in the three-volume grammar by Rustamov et al., published by the Tajikistan Academy of Sciences in 1985-89. It began to give way during the Perestroika period of the late 1980s to a less regimented style open to influences from a broader dialect range and Persian of Iran. Mode. Applied here to the verbal category of Non-Witnessed action (Taj. siga-i naqti, 3.21-24; also called the evidential, or non-evident, mode or viewpoint). This is an epistemic set of the Indicative mood, indicating by tense form that the information conveyed was obtained not by direct observation but through collateral sources, as hearsay, inference, or sudden realization. The term 'mode' is also applied to verbal constructions and particular verbs (modal auxiliaries) expressing ability, obligation, potential, etc. Mood. In this grammar, applied to the traditional verbal categories of Indicative (the unmarked set of tenses expressing unqualified statements and questions); Subjunctive (the set expressing contingent or unreal actions), and its related or subsidiary modes of Prohibitive, Optative, Precative, and Imperative (though traditionally this last is classed as a separate mood); and the Tajik category of the Conjectural (3.30-33), a set of three tenses expressing an unsupported presumption of the action. Morph. 
A significant lexical or grammatical unit smaller than a word, which does not necessarily have an independent lexical meaning (lex, ic-al, and ing are morphs: one a nominal stem, one a complex adjectival formative, one a gerundial or participial suffix). Noun Phrase. Used here in its broadest sense, a nominal (noun or pronoun) together with its typical adjuncts, such as plural suffix, article, determiner, adjective or other modifiers (e.g., a bucket of
green paint; these strange men in the bedroom)—seen as a
INTRODUCTION
9
component of a sentence, usually as subject, object, or complement (cf. VP). Quasi-passive. In Tajik Persian, an intransitive composite verb form with an auxiliary such b&yoftan 'to receive' orxurdan 'to undergo', correlating with a transitive verb using an auxiliary such as dodan 'to provide' oizadan 'to inflict' (5.18). Quasi-tense. A verbal construction in which a participle and an auxiliary combine to express a particular aspect-time; it differs from a recognized tense in that the auxiliary may take different tenses, or other auxiliaries may be used (3.20, 3.42). Quotative Past, English. The past tense in English sentential complements of speech reported, and events perceived or experienced, as in she said that she was sick and would not come (actual words: "I am sick, and will not come"), or / realized they were coming to get me (actual perception: "they are coming to get me"). It arises from a sequence-of-tense rule that views the event from the time frame of the reporter, and copies the tense used to record it subsequent to the event (the past, or past future). This is at odds with Tajik usage, which usually views the event from the time frame of the participant and copies the tense used in the actual utterance, or which would have been used had the participant voiced the experience or commented on his or her perception at the time (the present, or present future). Since the present English glosses of Tajik sentences aim to be idiomatic, it will occasionally be necessary to draw attention to this idiosyncrasy of English in order not to confuse the discussion of Tajik tense use. Example: mardi hezumkas did ki dar yak jo hezumjam' suda xobida ast 'the woodcutter saw that the firewood had been gathered together and was lying in one place'. (The English "quotative past" disguises in translation the true present time of xobidaast 'is lying'.) Sentential pronoun. 
A Tajik demonstrative pronoun serving as a prop for a preposition or conjunction, and referring forward to the sentential complement; e.g., Zaynab šod bud az on, ki vay bo Muxtor šarik ast 'Zaynab was happy to be a partner of Mukhtor' ('...happy from the fact that she is a partner...').
Standard Persian. Modern literary and educated spoken Persian of Iran.
Speculative simile. Also called a 'comparative clause', this is a phrase or clause introduced by the expression 'as if' (Taj. гӯё گویا güyo).
Suffix. A formative, usually lexical or semantic (e.g., of the plural), added to the end of a word; in Tajik, a suffix is stressed (contrast Enclitic).
Voice. Whether a verb is Active (John saw Mary) or Passive (Mary was seen [by John]).
Verb Phrase. A phrase of which the head is a verb. In its broadest sense, the verb and all its adjuncts (preverbs, adverbs or adverbial phrases, and any object or complement), as a component of a sentence, distinct from any noun phrases constituting the subject. More narrowly, the verb alone, or the VP excluding the object or complement NP; thus, thought up a solution immediately, or thought up a solution, or thought up may each be treated as a VP, according to the kind of analysis required.
Word order. The acceptable sequence of the constituents of a phrase or sentence. The main constituents are abbreviated as S (subject), O (object), V (verb), which include the extensions into NP and VP. Other constituents are Adv. (adverbial phrase) and Prep. (prepositional phrase).
Conventional Signs
Italics are used in Latin characters for transliteration from Cyrillic (obed 'lunch'), for transcription from Perso-Arabic or dialect (Taj. bošed, Per. bāšid), and to cite a word in any language as a linguistic example (Eng. doing).
[ ] In Tajik paradigms, syntagms, or examples, brackets enclose variables of the same category, e.g., nouns, pronouns, or verb stems, any of which are subject to the same rule or structure; also used to cite paradigms in the Indexes. In English glosses they enclose literal versions (inside single quotes), or material that is useful for an idiomatic English translation, but which does not appear in the original.
( ) In English glosses, parentheses enclose optional or supplementary words or phrases, or material that appears in the original, but may not be essential to the translation.
{ } Braces are used hierarchically (in Cyrillic and Perso-Arabic examples) to separate or enclose nested phrases or clauses, the better to illustrate sentence structure.
/ A slash separates alternatives. In English glosses, alternative translations are divided by phrase or clause.
/ / Slant lines enclose phonemic transcription; e.g., /abyet/ 'lunch'.
Precedes a normalization, in Cyrillic and Perso-Arabic, of a transcription of oral material in dialect or colloquial speech.
< (Derived) from, originating in
> Becomes, changes to, generates
* An asterisk preceding a word or phrase means it is not used, or is grammatically unacceptable in this form or context.
Abbreviations (Def. means: see the Definition in the list above)
adj.      adjective
adv.      adverb
Ar.       Arabic
colloq.   colloquial
cons.     consonant
CP        Classical Persian (Def.)
Cyr.      Cyrillic
dial.     dialect
Eng.      English
esp.      especially
Fr.       French
Ger.      German
Imper.    Imperative
intr.     intransitive
lit.      literal(ly), before a gloss; literary (stylistic register)
MLT       Modern Literary Tajik (Def.)
nom.      nominal
NP        Noun Phrase (Def.)
obj.      (direct) object
pl.       plural
prep.     preposition(al)
prov.     proverb or catchphrase
Rus.      Russian
sg.       singular
SOV       see Word order (Def.)
SP        Standard Persian (Def.)
Taj.      Tajik
tr.       transitive
Uz.       Uzbek
var.      variant
VP        Verb Phrase (Def.)
CHAPTER ONE
PHONOLOGY AND ORTHOGRAPHY
1.1 Integration of Sound and Script
The fundamentals of the sound system of Tajik Persian (1.2-8) may in theory be appreciated without reference to either of the two principal writing systems, which are expounded in Sections 1.9-14. The three parts may be approached independently, hence examples will be provided in one or more forms (transcription and/or transliteration, Cyrillic, Perso-Arabic) only to the extent necessary for the particular illustration. However, the beginner is advised to refer forward to Tables 1.10 and 1.13, the better to appreciate how the Cyrillic and the Perso-Arabic scripts interact. By way of preparation, these and the other writing systems of Tajik Persian may be characterized briefly as follows.
Perso-Arabic script is superabundant in consonants, and deficient in vowels. It retains eight redundant Arabic-specific consonants, while neglecting to mark three short (or "unstable") vowels (except for occasional diacritics; see 1.13). In the case of short /i/ this deficiency obscures the presence or absence of the grammatical izofat (2.10). The system also fails to distinguish /u/ from /ü/ (writing either of these indiscriminately with و or with nothing) and /i:/ from /e/ (writing both with ی); this latter ambiguity also has morphemic importance word-finally, where it involves three suffixes in -i and an enclitic in -e (1.13). The script tends to minimize ambiguity and homonymy in Arabic loanwords, while maximizing them for the Persian, Russian, and other vocabulary.
Cyrillic has four redundant characters (the yotated vowels, which represent syllables rather than single phonemes; 1.12) and, unless one accepts the absence of long vowels (1.2), it fails to distinguish the two vowel pairs /i/ from /i:/ and /u/ from /u:/. It has two ambivalent semivowels, е and и (in addition to their post-consonantal values as /e/ and /i/, after vowels they represent /ye/ and /yi/). It tends to minimize
obscurity and homonymy in Persian, Russian, and other non-Arabic vocabulary, while maximizing them for Arabic loanwords. The modified Latin alphabet that was in use briefly during 1928-1940 proved capable of representing simply and unambiguously the sounds and structures of Tajik Persian (1.9). Even this was hobbled by a decision to ignore the distinction between the two pairs of long and short vowels and, like Cyrillic, it had to tolerate a high ratio of homonymy for Arabic loanwords. Hebrew script, used by the Jews of Bukhara and Samarkand, applied a more explicit vowel system than Arabic and was to an extent a fair compromise between phonographic and etymological spelling; however, as the system of a religious minority, it was impractical to extend its scope. Cyrillic, without the flexibility and precedent of Latin as a neutral system of notation, remains confusingly Russian-specific. Perso-Arabic, for all its etymological spelling and vowel deficiency (which can at a pinch be circumvented with diacritics), enjoys the cultural advantage of displaying in one and the same orthography the common vocabulary of its traditional kulturbund (as do, for example, English and French, with a comparable degree of disconnection between spelling and speech). There is no universally satisfactory solution; within its limitations, each of the systems displays considerable ingenuity in representing a language which defeats the scholar's best efforts to craft even a consistent transcription-cum-transliteration. 
Homonymy in Tajik, through phonetic and/or orthographic coincidence of unrelated words, is more frequent than in SP, for two reasons: (1) The merging of former long and short vowel pairs in some dialects tended to remove a disambiguative contrast between, e.g., бино بینا /bi:no/ 'sighted, seeing' and бино بنا /bino/ 'building' (see 1.2); with the change to Latin and then Cyrillic orthographies designed for Tajik on the basis of vernacular pronunciation, this vowel merger was generally fixed in the written form too. (2) Application of a phonographic script (Latin, Cyrillic) removed some distinctions afforded by different (Arabic) consonants for the same sounds in Tajik, as in ستر 'concealment, veil'; سطر 'line (of writing)' (both сатр /satr/), or امارت 'command, emirate'; عمارت 'cultivation; building' (both иморат /imorat/). This involves mainly literary vocabulary. Nevertheless, dictionaries of homonyms are quite useful in Cyrillic-script Tajik; a recent one (M. Назриева, Луғати омонимҳои забони тоҷикӣ, 1992) runs to 240 pages and some 2,000 entries.
PHONOLOGY
1.2 Vowels (1): Stable and Unstable
According to the canons of Modern Literary Tajik as established in the 1930s, there are six vowel phonemes in standard Tajik Persian, articulated as follows (those boxed in the original figure, shown here in square brackets, are the so-called "unstable" vowels; see below).

FIG. 1.2 TAJIK VOWELS
High: [i] (front), [u] (back)
Mid: e (front), ü (central), o (back)
Low: [a]

In both Standard Persian (SP) and Tajik the eight-vowel inventory of Middle and Early New Persian has been reduced to six, but in quite different ways. In Persian of Iran the two long mid vowels /e:/ and /o:/ (the so-called majhul 'unfamiliar', i.e., non-Arabic, vowels, as in CP šēr شیر 'lion' and rōz روز 'day') collapsed with the long high vowels /i:/ and /u:/ (as in šīr شیر 'milk' and rūd رود 'river'); whereas in some varieties of Central Asian Persian, length was neutralized by the merger of the short and long high vowels and the rounding of long back /a:/ in the direction of /o/, as in Fig. 1.2. Thus Tajik /e/ and /ü/ are the successors of the old majhul long vowels; /i/ and /u/ are the continuation of the Classical ma'ruf or Persian long vowels, but they additionally represent the corresponding short vowels (as in dil دل 'heart, stomach' and but بت 'idol'); and /o/ is the continuation of the long back vowel /a:/, as in CP and SP بادام 'almond' (bādām in the usual transliteration). The asymmetrical position of /ü/ is due to its having merged with Uzbek /ü/ (orig. common Turkic vowels /ö/ and /ü/); thus the same vowel appears in Tajik borrowings from Uzbek, e.g., kümak 'help',
kürpa 'quilt' (see also 1.14). The Classical opposition of long vs. short vowels has been preserved as the basis of the literary prosodic system (aruz\ see below), but is no longer fully applicable to the spoken varieties of either Iranian or Central Asian Persian. It has been argued that the phonemic contrast of length has been replaced in both dialects by a contrast between stable and unstable vowels.1 The stable vowels of Tajik, which are phonetically invariant, are the mid (half-close) vowels /e/, /ii/ and /o/ (unboxed in Fig. LI); these do not change appreciably in length or quality in any position. The unstable vowels, in which length and quality of articulation may vary according to the phonetic environment, are /i/, /u/ and /a/. Thus in stressed position and unstressed closed syllables (CVCC), the unstable vowels are equivalent in length to the three stable vowels: /panfr/ 'cheese', /mizgon/ 'eyelashes'; IA"mgl 'lie, untruth', /duxtar/ 'girl'; /d'gar/ 'other', /haätod/ 'eighty', as in the syllables underlined. In unstressed open syllables (seen also in three of the words above), they may be shortened and reduced to a schwa Ы or elided. Further examples: /did/ '(s)he saw', /d'mog/ 'nose'; /dud/ 'smoke', /gud6z/ 'melting'; /bad/ 'bad', /tfdän/ 'body'. "May be" does not mean "must be," and in fact lexis and morphology still trump phonology. Thus дидор J I J ^ J /diidor/ 'meeting, visit; countenance' retains a long (or "stable vowel equivalent") first syllable because it is a derivative of the verb Stem II did- 'see'. This is too significant a segment to be reduced simply because it happens to fall in an open syllable before a stressed stable vowel; unlike, e.g., бидон <jlju /bidon/ 'know!', where the first syllable is stressed, as befits an Imperative (3.29), but canonically short, to retain its identity as that particular prefix. 
Similarly, мушак موشک /mu:šak/ 'rocket, missile' (a diminutive of muš 'mouse, rat') contrasts, by virtue of the lexical stem, with мушир مشیر /mušir/ or /muši:r/ 'counselor' (an unanalyzable Arabic participle). These contexts where /i/ and /u/ remain stable despite falling in an unstressed open syllable correspond, of course, to syllables where they are written with ی and و in Perso-Arabic script, in accordance with morphology and etymology.
1 See, for Persian, Lazard 1957/1992, § 7; for Tajik, Rastorgueva 1953, pp. 67-68/1963, p. 4. For a dissenting view, see Mirzozoda 1994.
This ambiguity of /i/ and /u/ is resolved in speech. The contrast in quality is not as distinct as the corresponding contrasts in Standard Persian, but in general "long" /i/ and /u/ in stable position or stable-equivalent context are perceived in Tajik as being higher, and tense, in comparison with the unstable or "short," generally lax, vowels, if not always longer in terms of milliseconds. The oversimplification implicit in the ostensibly phonemic Cyrillic writing system thus causes some confusion. In several (Cyrillic) homographs, this same contrast, of length in the older system, is shown in the Perso-Arabic orthography, though not in the Cyrillic: пул پل /pul/ 'bridge' vs. пул پول /pu:l/ 'money' (similar to Eng. pull and pool); бино بنا /bino/ 'building, basis' vs. бино بینا /bi:no/ 'sighted, able to see'; cf. [US] Eng. to debark (a ship), vs. to debark (a dog). It is also responsible for some common spelling errors in Cyrillic (see 1.12). The single concession made to this contrast in Cyrillic is between word-final short, unstressed /i/ and long, stressed /i:/ (cf. Eng. trusty and trustee). Since in (Tajik) Persian this contrast is grammatically significant, it is shown by means of a diacritic macron over the character: и / ӣ (see 1.12).
An unintended consequence of the dropping of vowel length as a feature of the written language was the obscuring of the traditional prosody (aruz عروض), which determines long and short syllables in traditional verse. Poets of the second and later generations of MLT (such as Mirzo Tursunzoda), who had not learned to read Perso-Arabic script, unwittingly composed verses that do not scan, and rhymes (such as ҷогир جاگیر /jogi:r/ with ҳозир حاضر /hozir/) that do not work.
1.3 Vowels (2): Individual Qualities
The central and back vowels are rounded. /u/ (orthographically у j - ; unstable) is close to the cardinal vowel and Eng. pull or pool (cf. 1.1; and see Lowering, below). Long and short vowels, or unstressed open syllables (where the vowel is reduced in length and quality), can be distinguished by the PersoArabic spelling, but not by the Cyrillic: буд jj_> /bu:d/ 'was', шуд ^ll/sud/ 'became'; хунин ^J»j-^/xu:nin/ 'bloody', чунин
'such'; муздур j j j > /muzduir/ (<muzd-var) 'wageеатег',шутур j " 11t/sutur/ 'camel', сутур j j " ... /s^ur/ 'draught or riding animal'. /п/ (у j l ; stable), lies phonetically between [u] and [y], i.e. halfway to the Umlaut, a little lower than Eng. good as pronounced in lowland Scottish (imitated spelling, guid), but higher than French peu. It is less rounded and more lax than /u/. This vowel is phonemic only in Northern dialects; in Central and Southern speech it is generally replaced by (stable) /u/ (see also 1.11): кӯҳна * \ц < /kühna/ or /kuhna/ 'old' (see also Lowering, below). /о/ (о, alif in Perso-Arabic; stable) is lower (more open) than Russian o, somewhat as in Eng. awful, and uniform in quality throughout its length, without either the u- onset of Russian or the -w offglide of English in some environments. It is the most stable vowel, prosodically long, but of consistent and unique quality in any environment, so that length is irrelevant as a distinctive feature (unlike the other two stable vowels). It may be nasalized before syllable-final /n/: он ҷо I? \~\onjo /qjo/ 'there'. In some Southern dialects it may be less rounded and closer to SP /ä/ [a], to which it corresponds historically: обод J L J /obod/.../äbäd/ 'fertile, prosperous, inhabited'. The fact that this sound approximates Russian о more closely than it does any other Russian vowel, and is thus transcribed by о to and from Cyrillic, has three unfortunate consequences. It obscures the historical affinity between /o/ as ä and (unstable) /a/ (they were prosodic pairs, /a/ and /a:/, as were CP /i/ and /i:/, /u/ and /u:/—see 1.8); it disguises the morphological identity of Tajik words with cognates in other varieties of Persian, e.g., boridan 'to rain' with SPbäridan ü ^ j L » (not boridan o ^ j - e ' t o C U O ; and it prompts flagrant mispronunciation by non-Russian speakers learning Tajik. 
There is a prime analogy on all three counts with the 19th-century transcription of Indo-Persian short a (which is more close than in Persian) by English и (then pronounced [л] as in but or cup), which duly generated oddities like cummerbund for /kamarband/ J_1JJ_A£ 'waist sash' and Punjab for /panjäb/ ^ 1 ^ "ij (though without having these replace the Perso-Arabic represent-
ation for Indians). The front vowels form a more regular pattern in sound and transcription. Care should be taken not to confuse the values, or transcriptions, of /i/ and Id with their SP counterparts, which are in effect reversed; thus the pen-name of the Indo-Persian poet бедил J ^ /beidfl/ ('heartless'), as he is known in Central and South Asia, is pronounced by Iranians more like /birdel/. /i/ (и, й 'feeling, sense', too, is often found in geminate form. Other common examples are as follows (given in Perso-Arabic script only where they contain non-default characters; see 1.13): dur /dur/ 'pearl', durri noyob /durri noyob/ 'rare pearl'; kul 'all, whole', kulli mardum 'all the people'; sir '(the) secret', sirre 'a secret';xat .LA'line; writing', xatti arabi ^ j - ^
L
^ 'Arabic script';
hob c-A 'pill, tablet', habbi sulfa < 51 ,„ L^'cough tablet'; had j-a. 'limit, boundary', haddi aksar j - l ^ l J ^ . 'maximum'; haq j ^ 'truth, right', dar haqqi
J ^ j j 'concerning'; hal J ^ 'solution',
- < - *> ^j.1^ Jja. halkunii namak 'dissolving salt, salt solution', but ^1 «ü_l ^u о J^a. /гаШ mas'alae 'a solution to the problem';fan 'technique, science', fannl 'scientific', etc. Geminates are represented in Cyrillic, and optionally (with tasdid\ see 1.12) in Perso-Arabic. It is not incorrect to write tasdid over a singular or pre-consonantal form: ^->bL» ^j " * — , etc. A few such loanwords are assimilated to the extent that they have lost gemination entirely: kafi dost 'palm (of the hand)', kafi po 'sole (of the foot)'. The word қад j
2 'size, stature, height,
length' vacillates: qaddi rasol qadi raso 'full size', qad-u bar 'length and width'; in its use as a prepositional phrase it generally has a single IAI: qad(d)i кпЫ 'along the street' (2.21). In a particular idiom, the Persian noun худ JJ-L 'self (2.32), which does not end in a geminate, often has a reduplicated /d/ before izofat: худди ту барин /xuddi tu barin/ 'just like you' ('like you yourself; see term, 2.23). Prosodic pairs. In accordance with the older system of vowel opposition in Persian (1.2), the five long vowels were paired with three short
vowels, ä : a, lie : /, й/ö : и for purposes of contrasting long and short syllables in poetry. By poetic license, a short vowel might be substituted for its (canonical) long equivalent in some frequentlyoccurring words. The last two contrasts provide only ad hoc prosodic variants in verse, and cannot always be represented in Cyrillic, since it does not distinguish long and short vowels; e.g., niko : nikö (in Tajik, некӯ j < y - : накӯ ^£J») 'good'; büd : bud : jj_» (л-», both буд in Cyrillic). However, a for ä is still to be found in lexical forms such as names and compounds: раздор
JIJ-*J
rah-dor 'striped, streaky'
('having stripe(s)' < roh 'road, stripe'); a half-dozen other compounds have roh- by preference, rah- as variant, e.g., rohbarl rahbar 'leader', rohzan/ rahzan 'highway robber'; mahtob 'moonlight', mahtobi 'moonlit' (< moh 'moon' + Stem I toftan 'to shine'); sahnoy 'wind instrument' (<soh 'king'; an augmentive, 5.6; for noy, see next). The variation even provides a few lexical doublets: nay 'reed, cane, pipe, flute'; noy is the poetical variant in these senses, but cf. the diminutives noyca 'windpipe, bronchial tube; blowpipe; weaver's reed' vs. nayca 'sprout, shoot; reel, spool; (cartridge) case, shell'.
ORTHOGRAPHY
1.9 Writing Systems: Introduction
New Persian as a literary language evolved at the court of Bukhara more than a millennium ago, by adapting Arabic loanwords and the Arabic alphabet to spoken Middle Persian. Since then other literary centers, chiefly on the Iranian plateau, have dominated its development, with only grudging and belated regard to changes in the spoken idiom of the center and none for those of the periphery, as is the way of literary languages. The social and political upheavals of the early twentieth century prompted a revival of literary Persian at Bukhara and its cultural hinterland, again based on the spoken vernacular (by now
rather different from both its predecessor and that of its neighbors in Iran), and again adapting the lexicon and writing system of a new ideology to existing norms. Thus was born Tajik Persian (see Chronology, 5.27). Toward the end of the 20th century, the collapse of this ideology and its imperial network brought about a mixed reaction: a revulsion from a culturally invasive colonial tongue, balanced by a need to maintain the machinery of everyday administration; a reversion to classical ideals half-lost in the course of centuries, coupled with a resolve to participate in both the wider Persian-speaking (and -writing) world, and the world at large, that for so long had been closed to Tajikistan. In these deliberations, language and script have played an important role in both symbolic and practical areas. The pendulum is still in motion: the language hovers between Sovietized past and re-Persianizing future, the writing system shared unequally between an established (though reformed) Cyrillic, a re-emerging Perso-Arabic, and the universal auxiliary Latin alphabet. All these tendencies are of necessity reflected in the present Grammar. The Tajik Latin alphabet. Used briefly between 1928 and 1940, this was a highly successful adaptation of the Latin alphabet as a scientific, international system of transcription, applicable in principle to any language, with the aid of a few extra diacritics. It was in origin a by-product of the work of Turcophone intellectuals of Baku and Kazan, who hoped to develop a unified Turkic alphabet; this hope was frustrated by political more than linguistic obstacles, and the Latin alphabets as adopted in Turkey and in Turkestan in 1928 were slightly, and unnecessarily, different. 
The Tajik version was phonographically consistent (i.e., one character corresponded to one sound of the language, at least in the case of consonants), and betrayed its underlying Russian matrix only in the alphabetical order of the letters and the use of о for the "long a" represented by alif in Perso-Arabic script (1.3). As this system is listed here (see Fig. 1.10), the three digraphs and extra e in parentheses are merely to show the correspondence with later Cyrillic characters that are redundant in Tajik. The early drafts of the Latin alphabet contained only lower-case letters, the modernist argument being that capitals were superfluous; in this form a range of
Tajik publications was printed during 1926-29. Lower-case b took the form of a smaller upper-case В in the majority of charts and publications. Four diacritics were used (/-macron for terminal stressed /i/, later carried into Cyrillic; slashed z for /i/, cedillas under с for /j/ and under s for /S/), and one new character (q for /g/, as in the Turkic alphabets). The hiatus (for 'ayn or hamza) was represented by the apostrophe (apostrof). The remaining 31 letters are those commonly available and expected. Despite some inconsistencies in the earliest printed texts, the whole system can be very quickly learned by a reader with some experience of either the Perso-Arabic or the Cyrillic scripts of Tajik. However, in the spelling of words containing an original 'ayn or hamza, and its conflation of long and short vowels, it generated ambiguities that were compounded in the Cyrillic system (see 1.2-3, 1.11-12). The Jews of Central Asia (centered in Bukhara and Samarkand) developed a slightly different version of the Latin alphabet to replace their Hebrew system (see below), and used it until 1935, when they adopted the common Tajik Latin alphabet, replaced by a common Cyrillic alphabet in 1940. For the vowel 0 ö they used Ü u, for the apostrophe when representing £ (Hebrew 'ayiri), Э э, and for H h representing j- (Hebrew heth)9 Ц Ij, since these two sounds had phonemic value in Jewish Tajik dialects. The Hebrew alphabet. Pre-Soviet Judeo-Tajik script, like Arabic, was written from right to left, but differed from Arabic in a number of conventions. The "short" vowels omitted in Arabic were written in full, by means of wow and yod and often the Tiberian diacritics too. Gimel served for both /g/ and /g/ (Arab, gayri) and, with different diacritics, for /C/ and /j/; beth differentiated served for /b/ and /v/, pe differentiated for /p/ and /f/, zayin was both /z/ and /z7. The remaining consonants were as in Hebrew.4 1.10
Cyrillic (1): General
4 For a full table and discussion, see Rzehak 1999, pp. 93-95; 2001, pp. 269-70.
Whereas the Latin writing system was an honest (if flawed) scientific adaptation of an international, largely language-neutral, notation, the Cyrillic alphabets introduced for the Central Asian languages in 1939
(all different, by political intent as much as linguistic punctilio) were unabashedly Russian-specific. A few of the continuing problems will be addressed briefly in the next two sections.5
Italics. The second column of a Russian alphabet is usually devoted to the elaborate "copperplate" handwriting which, in most writers, soon devolves into a more practical hand. The difficulties, such as they are, lie in the ligatures rather than the isolated letters, so that the style is better learned by practice. It is replaced in Fig. 1.10 by the italic printed style based on it (called kursiv), which is widely used and demonstrates most of the same character deviations from the upright style, especially in lower case. The following are salient: г and ғ take on a reverse "s" shape; д becomes an uncial "d"; the three members of the и family are written like a Latin "u"; п comes to resemble a Latin "n", т a Latin "m", and ш a script "w" (additional differences in handwritten Cyrillic involve the letters б, в, and д). Because of the coincidences with Latin u, n, and m, added to the similarities of Cyrillic letters to Latin H, p, c, y, and x, it is inadvisable to mix Cyrillic and Latin italics; Cyrillic italics will not be used in this book.
Alphabetical order. The Soviet-era Tajik Cyrillic alphabet (1940-1990), in the original order of the 39 letters, is:
а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я * ғ ӣ қ ӯ ҳ ҷ
The four letters ц, щ, ы, ь (underlined in the original) are Russian-specific characters, introduced in 1953, and dropped from the Tajik alphabet by 1998. Their values are respectively /ts/, /šč/, /ɨ/ (a close central vowel, called еры yery), and the Soft sign (see below). The last six letters, following the asterisk, are those specific to Tajik Persian, originally appended to the Russian alphabet; they comprise the base letters г, и, к, у, х, ч with additional diacritics. These were integrated into the revised alphabet, each after its corresponding base letter (cf. Fig. 1.10), officially in 1998.
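The revised ordering described above can be sketched in code. The following is a minimal illustration, assuming the 35-letter order read off the Cyrillic (1998) column of Fig. 1.10; the names REVISED_ALPHABET, tajik_sort_key, and sort_tajik are hypothetical, and placing unknown characters last is an arbitrary choice, not a rule from the text.

```python
import unicodedata

# Revised (1998) order: each Tajik-specific letter follows its base letter,
# and the Russian-specific letters (ts, shch, yery, soft sign) are dropped.
REVISED_ALPHABET = unicodedata.normalize(
    "NFC", "абвгғдеёжзиӣйкқлмнопрстуӯфхҳчҷшъэюя"
)
RANK = {letter: index for index, letter in enumerate(REVISED_ALPHABET)}


def tajik_sort_key(word):
    """Map a word to a list of letter ranks in the revised alphabet.

    Characters outside the alphabet sort after all letters (an assumption).
    """
    normalized = unicodedata.normalize("NFC", word.lower())
    return [RANK.get(ch, len(REVISED_ALPHABET)) for ch in normalized]


def sort_tajik(words):
    """Sort words by the revised Tajik alphabetical order."""
    return sorted(words, key=tajik_sort_key)


if __name__ == "__main__":
    print(sort_tajik(["ҳал", "худ", "қад", "хат", "кӯҳна"]))
```

A plain code-point sort would place қад after худ, because the Tajik-specific letters sit in a later Unicode range than the basic Russian letters; the rank table restores the integrated order.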
Revised spelling. Other differences from the Soviet-era alphabet are as follows. Russian ц /ts/ is replaced either by с /s/, even in some Russian names: симент 'cement', Елсин /yelsin/ 'Yel'tsin'; or, intervocalically, by тс /ts/: сотсиалистӣ 'socialist' (adj.).
5 For more detailed discussion, see Perry 1997; Rzehak 2001, pp. 329-33.
FIG. 1.10 THE TAJIK ALPHABET: CYRILLIC AND LATIN
[Table columns: Cyrillic (1998), Italic style, Letter name, Sound, Latin (1929), Transcription, Perso-Arabic]
Cyrillic (1998) order: А Б В Г Ғ Д Е Ё Ж З И Ӣ Й К Қ Л М Н О П Р С Т У Ӯ Ф Х Ҳ Ч Ҷ Ш Ъ Э Ю Я
The Soft sign ь (in Tajik, аломат-и ҷудоӣ علامت جدایی 'separation sign') was formerly placed either after a final consonant to "soften" it (a Russian requirement, irrelevant in Tajik) or between a consonant and one of the Russian yotated vowels to force a consonantal y-onset (usually superfluous in Tajik; e.g., дарьё دریا /daryo/ 'river'; see 1.11, under и). Before the vowel е, however, uncritical excision of the soft sign may have unwanted results in some cases. In Rus. Вьетнам 'Vietnam' it accentuates the /y/ after the initial consonant, but if it were dropped from the Tajik loanword the preceding consonant would prompt the erroneous pronunciation /vetnam/ (see the rules for е, 1.11); the Tajik spelling should remain unchanged or, if revised, would be Виетнам ویتنام.
Transcription. The last column of Fig. 1.10 shows transliteration values for (Cyrillic) Tajik. For comparison, words are occasionally transliterated or transcribed as from Standard Persian of Iran; "long" vowels are then written as ā, ī, ū, and "short" vowels as a, e, o.
1.11 Cyrillic (2): Consonants
There is only one Cyrillic character used to represent a Tajik consonant which presents any inconsistencies (apart from the semi-vowel й /у/; see 1.13). This is the glottal stop sign (ъ, corresponding to the Russian Hard sign), called аломати сакта < " j ^ /yertiS/ (Uz.) 'piece of cloth given to mourners as a memento', ёр j L /yor/ 'friend, comrade', яла «dl /yalä/ 'open', гӯяд s^jZ /guyad/ 'says', юндӣ /yundi/ 'dishwater'. y
In the sound sequence /io/, /iʸo/ the second vowel is usually written not as о but ё: сиёҳ 'black', арманиён 'Armenians' (sg. арманӣ + -ʸон); similarly, /ia/, /iʸa/ is spelled with я, and /iu/, /iʸu/ with ю: Булғория 'Bulgaria', иқтисодию иҷтимоӣ 'economic and social' (4.11; see also next paragraph, and below under И и). In Russian, when one of these vowels (or и; see below) begins a syllable following a consonant, it is buffered with ь, the Soft sign; this practice was followed in Tajik until discontinued in 1998, e.g., дуньё > дунё دنیا /dunyo/ 'the world', афьюн > афюн افیون /afyun/ 'opium'. Where prosodic and morphological boundaries do not coincide, a yotated vowel may disguise a Tajik stem and/or suffix: cf. ояд آید o-y-ad /oyad/ or биёяд بیاید bi-o-y-ad /biyoyad/ 'let him come', where the prefixed variant forces the use of a different character (ё) to represent the same stem /o/, in addition to a character (я) that merges a facultative buffer with the standard suffix /ad/ (see 3.4-5).
The use of yotated vowels had, and has, other repercussions for Tajik orthography:
Е е stands for /e/ after a consonant, but /ye/ word-initially or after a vowel: мебинем /mebinem/ 'we see'; мегӯем /megüyem/ 'we say'; after a low vowel, the /y/ may be a less prominent glide: дароед! /daroʸed/ 'come in!'. (SP i, yi).
Э э (transcribed e) stands only for /e/, and must therefore begin a word or syllable with this initial value: элак /elak/ 'sieve', эҳтиёт /ehtiyot/ 'caution, prudence', боэҳтиёт بااحتیاط /boehtiyot/ 'cautious, prudent' (lit. 'with prudence'); эҳтиром 'respect', беэҳтиром /beehtirom/ 'disrespectful' ('without respect'). Compounds are written as one word in Cyrillic (1.15), so the juxtaposition of the two different e's is a natural consequence of the redundant Russian-specific feature. (SP i).
Й й /y/, a consonant or semi-vowel, in addition to its Russian name yot, is known more formally in Tajik as и-и кӯтоҳ 'short i'. Thanks to the convention of yotated vowels, it is found
word- or syllable-initially only before the vowels и and ӯ, and exceptionally before о: йигит /yigit/ 'youth, young man, (daring) horseman'; йӯрға /yurgä/ 'trot'; район /rayon/ 'region', майор /mayor/ 'major', йод /yod/ 'iodine' (the combination йо is found only in a few foreign loanwords in Russian carried over into Tajik; contrast the Persian word ёд /yod/ 'memory'). The combination -йӯм may be found as the ordinal number suffix in older texts, but was judged too pedantic and replaced by -юм /-yum/ (2.52). Geminate /yy/ is always written with a sequence of two distinct graphs, й plus the appropriate yotated vowel: тайёр /tayyör/ 'ready', айём /ayyöm/ 'days (of yore)', муайян /muayyän/ 'fixed, designated'. The vowel и also counts as yotated for this purpose (see below): мутаҳайир /mutahayyir/ 'astonished'; й is thus never written double.
И и /i/ is a hybrid, and a more subtle joker than е. Usually it represents Tajik /i/ initially or internally: ин зин /in zin/ 'this saddle'. It is followed by a yotated vowel in the sequences /io/ иё, /ia/ ия, /iu/ ию (see above; SP e, ye, i, yi). It is not classed as a yotated vowel, but does "soften" (palatalize) most preceding consonants (in Russian), while in Tajik it assimilates a preceding terminal /y/: наистон /nayiston/ 'reed bed, cane brake' (nayiston, see 5.2). This use of и for /yi/ occurs most frequently in izofat phrases: рӯи зан /rüyi zan/ 'the woman's face', бар рӯи миз 'upon the table' (both […] 'let us weep', as distinct from гирем /girem/ 'let us get'; morphologically, the word is giry-em. (2) After the Hard sign, ъ, when и alone represents /yi/: таъин /ta'yin/ 'designation, appointment' (cf. меъёр /me'yor/ 'standard, criterion'). (3) When geminate y is followed by i: мутаҳайир /mutahayyir/ 'astonished' (и as /yi/; cf. yotated vowels providing the second /y/ in the sequence /yy/ in муайян, тайёр under й above).
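The spelling conventions for /y/ described above (yotated vowels for /ya/, /yo/, /yu/, и for /yi/ when not word-initial, and й elsewhere) can be illustrated in the reverse direction, from a phonemic string to Cyrillic. This is a rough sketch, assuming a hypothetical function `spell`, a deliberately small phoneme table, and plain transcriptions without stress marks; it does not cover the loanword exception йо (йод, район), the Hard sign, or final ӣ.

```python
# Hypothetical, incomplete phoneme-to-letter table (assumption of this sketch).
LETTERS = {
    "a": "а", "b": "б", "d": "д", "g": "г", "h": "ҳ", "i": "и", "m": "м",
    "n": "н", "o": "о", "r": "р", "s": "с", "t": "т", "u": "у", "z": "з",
    "ü": "ӯ",
}
# /y/ plus one of these vowels is written with a single yotated letter.
YOTATED = {"a": "я", "o": "ё", "u": "ю"}


def spell(phonemic):
    """Spell a stress-free phonemic string in Cyrillic, applying the /y/ rules."""
    out = []
    i = 0
    while i < len(phonemic):
        ch = phonemic[i]
        nxt = phonemic[i + 1] if i + 1 < len(phonemic) else ""
        if ch == "y" and nxt in YOTATED:
            out.append(YOTATED[nxt])   # /ya/, /yo/, /yu/ -> я, ё, ю
            i += 2
        elif ch == "y" and nxt == "i" and i > 0:
            out.append("и")            # non-initial /yi/ -> и alone
            i += 2
        elif ch == "y":
            out.append("й")            # remaining /y/, e.g. first half of /yy/
            i += 1
        else:
            out.append(LETTERS[ch])    # raises KeyError outside the table
            i += 1
    return "".join(out)


print(spell("tayyor"))      # тайёр
print(spell("mutahayyir"))  # мутаҳайир
print(spell("yigit"))       # йигит
```

Because the first /y/ of a geminate is always followed by another /y/ and the second is always absorbed by a yotated vowel or by и, the sketch never produces a doubled й, in line with the statement in the text.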
Word-initial и is found before a (yotated) vowel in a number of Russian loans that originated in Greek or Latin: июн(ь) /iyun/ 'June', июл(ь) /iyul/ 'July', иероглиф /iyeröglif/ 'hieroglyph'. It is followed by о in a few transferred loanwords such as радио 'radio' (Taj. /radio/ or /radio/).
The letter ӣ (with macron; called и-и заданок 'stressed i' or и-и дароз 'long i') is a device to distinguish accented word-final -i from unstressed final -i, which occurs only (but frequently) as the syntactic izofat enclitic: e.g., дӯсти ман 'my friend' (2.10). Stressed final -i (SP i) occurs under the following conditions: (1) As an integral part of a noun: моҳӣ /mohi/ 'fish', таксӣ
[Table of Perso-Arabic characters; only the following letter names are recoverable: pe, te, se (-i se nuqta), jim, ce, he (-i hutti), xe; sin, šin, sod, zod; fe, qof, kof, gof; lom, mim, nun, vov.]
1.14 Perso-Arabic (2): Vowels
There remain five vowel diacritics, and four consonants which, besides their consonantal roles, play a part in specifying the vowels of Tajik and otherwise regulating the orthography.
Vowel diacritics: The three conventionally "short" vowels of Persian, corresponding to /a/, some of /i/ and /e/, and some of /u/ and /ü/ in Tajik (see 1.3), are not usually written except to disambiguate homographs and to specify vowels in dictionaries and grammars. For instance, the consonantal matrix گرد may stand for any of the three words گَرد gard 'dust', گِرد gird 'round', and گُرد gurd 'hero', which may not be identifiable even in context without the diacritics. These will accordingly be written in this grammar to assist in word recognition and especially to make explicit the izofat and other grammatical relationships. Collectively called harakot, lit. 'movements', they are placed directly above or below any consonant except the terminal (but see zer):
ـَ (called zabar 'above') represents /a/: talab 'demand', atlas 'satin'. It should not be confused with the longer, horizontal mad(d) placed over alif (see below). It may also indicate /a/ as the first component of a diphthong, when followed by ی or و: mayda 'small', savr /sawr/ 'third month of the Iranian solar year' (see 1.4, Diphthongs).
ـِ (zer 'below'), shaped like zabar, represents short /i/, or /e/ as an allophone of this before /h/ or /'/ (see 1.4, Lowering): zireh 'armor', mehnat 'toil, labor', ijozat 'permission', fe'l 'verb'. When used under the last consonant in a word, it indicates the izofat enclitic: […]
[…] rafta 'gone', xona 'house, home' come respectively xastagi 'fatigue', raftagon 'the departed', xonagi 'domestic, household' (adj.). Such derivatives are usually written as a single word, but those in -gi in particular may also be found as […], etc. (see 3.44–46). The Indefinite-Specific enclitic -e, and its homograph the Relative enclitic -e, which have an unvarying vowel onset, are written with alif and ye. […] mohi-i bahr /mohiʸi bahr/ 'sea fish', neki-i mutlaq 'absolute good'. In older texts, izofat after final ی may be written with a miniature ی above it (or, if printed, a hamza) […]. (3) In many nouns ending in -o (e.g., jo 'place'), and most of those in -ü (like mü 'hair'), the canonical form actually ends in consonantal -y /y/: joy, müy (see 1.8). Accordingly the addition of zer to the semi-consonant in these cases is the same procedure as adding it to the final consonant of a regular noun, e.g., بادِ سرد 'cold wind'. It follows that an explicit zer can disambiguate an izofat phrase from a Possessive (exocentric) compound (5.8), where ی alone would be insufficient to indicate the presence of izofat: مویِ سفید 'white hair', but موی‌سفید müy-safed 'old man' (lit. 'white-haired [one]'; cf. 2.3).
Ye for alif /o/: After a noun ending in alif maqsura, i.e., ی representing the vowel /o/ (1.13), the ye is changed into an alif and izofat is shown obligatorily by means of ی, as above: ma'no gives /ma'nöʸi jumlä/ 'the meaning of the sentence'; Iso gives /isoʸi masih/ 'Jesus the Messiah'.
Hamza: If hamza is written at the end of a word, it is treated as a consonant; either zer is added, or nothing: